# NaNStatistics

`NaNStatistics.histcounts`

`NaNStatistics.histcounts`

`NaNStatistics.histcounts!`

`NaNStatistics.histcounts!`

`NaNStatistics.inpctile`

`NaNStatistics.movmean`

`NaNStatistics.nanaad`

`NaNStatistics.nanadd`

`NaNStatistics.nanadd!`

`NaNStatistics.nanbinmean`

`NaNStatistics.nanbinmean`

`NaNStatistics.nanbinmean`

`NaNStatistics.nanbinmean!`

`NaNStatistics.nanbinmean!`

`NaNStatistics.nanbinmedian`

`NaNStatistics.nanbinmedian!`

`NaNStatistics.nancor`

`NaNStatistics.nancor`

`NaNStatistics.nancov`

`NaNStatistics.nancov`

`NaNStatistics.nanextrema`

`NaNStatistics.nanmad`

`NaNStatistics.nanmask`

`NaNStatistics.nanmask!`

`NaNStatistics.nanmax`

`NaNStatistics.nanmaximum`

`NaNStatistics.nanmean`

`NaNStatistics.nanmean`

`NaNStatistics.nanmedian`

`NaNStatistics.nanmin`

`NaNStatistics.nanminimum`

`NaNStatistics.nanpctile`

`NaNStatistics.nanrange`

`NaNStatistics.nanstandardize`

`NaNStatistics.nanstandardize!`

`NaNStatistics.nanstd`

`NaNStatistics.nanstd`

`NaNStatistics.nansum`

`NaNStatistics.nanvar`

`NaNStatistics.zeronan!`

`NaNStatistics.histcounts!`

— Method`histcounts!(N, x, y, xedges::AbstractRange, yedges::AbstractRange)`

Simple 2D histogram; as `histcounts`, but in-place, adding counts to the first `length(xedges)-1` columns and the first `length(yedges)-1` rows of `N`.

Note that counts will be added to `N`, not overwrite `N`, allowing you to produce cumulative histograms. However, this means you will have to initialize `N` with zeros before first use.

`NaNStatistics.histcounts!`

— Method`histcounts!(N, x, xedges::AbstractRange)`

Simple 1D histogram; as `histcounts`, but in-place, adding counts to the first `length(xedges)-1` elements of Array `N`.

Note that counts will be added to `N`, not overwrite `N`, allowing you to produce cumulative histograms. However, this means you will have to initialize `N` with zeros before first use.
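The accumulate-in-place behaviour can be pictured with a minimal Base-Julia sketch. The helper name `sketch_histcounts!` is hypothetical and is not the package's implementation; it assumes half-open bins `[lo, hi)` and skips out-of-range values, neither of which is stated in the docstring.

```julia
# Hypothetical helper (an illustration, not the package's code): add the counts
# of `x` to the first length(xedges)-1 elements of `N`, skipping NaNs.
# Assumption: bins are half-open [lo, hi); out-of-range values are skipped.
function sketch_histcounts!(N::AbstractVector, x, xedges::AbstractRange)
    nbins = length(xedges) - 1
    lo, width = first(xedges), step(xedges)
    for xi in x
        isnan(xi) && continue                      # ignore NaNs
        i = floor(Int, (xi - lo) / width) + 1      # bin index for xi
        if 1 <= i <= nbins
            N[i] += 1                              # add to the existing count
        end
    end
    return N
end

N = zeros(Int, 4)                                    # start from zeros
sketch_histcounts!(N, [0.5, 1.5, NaN, 3.5], 0:1:4)   # first batch
sketch_histcounts!(N, [0.5, 2.5], 0:1:4)             # second batch accumulates
```

After both calls `N` holds the counts of the two batches combined, which is why a fresh `N` has to be zeroed before its first use.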

`NaNStatistics.histcounts`

— Method`histcounts(x, xedges::AbstractRange; T=Int64)::Vector{T}`

A 1D histogram, ignoring NaNs: calculate the number of `x` values that fall into each of `length(xedges)-1` equally spaced bins along the `x` axis with bin edges specified by `xedges`.

By default, the counts are returned as `Int64`s, though this can be changed by specifying an output type with the optional keyword argument `T`.

**Examples**

```
julia> b = 10 * rand(100000);
julia> histcounts(b, 0:1:10)
10-element Vector{Int64}:
10054
9987
9851
9971
9832
10033
10250
10039
9950
10033
```

`NaNStatistics.histcounts`

— Method`histcounts(x, y, xedges::AbstractRange, yedges::AbstractRange; T=Int64)::Matrix{T}`

A 2D histogram, ignoring NaNs: calculate the number of `x, y` pairs that fall into each square of a 2D grid of equally spaced square bins with edges specified by `xedges` and `yedges`.

The resulting matrix `N` of counts is oriented with the lowest x and y bins in `N[1,1]`, where the first (vertical / row) dimension of `N` corresponds to the y axis (with `size(N,1) == length(yedges)-1`) and the second (horizontal / column) dimension of `N` corresponds to the x axis (with `size(N,2) == length(xedges)-1`).

By default, the counts are returned as `Int64`s, though this can be changed by specifying an output type with the optional keyword argument `T`.

**Examples**

```
julia> x = y = 0.5:9.5;
julia> xedges = yedges = 0:10;
julia> N = histcounts(x,y,xedges,yedges)
10×10 Matrix{Int64}:
1 0 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 1
```
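Because the example above is symmetric in x and y, it does not show which axis maps to rows. The following Base-Julia sketch uses a hypothetical helper, `sketch_histcounts2d`, that follows the orientation described in the docstring (rows index y bins, columns index x bins). The half-open bin convention is an assumption.

```julia
# Hypothetical helper illustrating the orientation only (not the package's code).
# Assumption: bins are half-open [lo, hi); out-of-range pairs are skipped.
function sketch_histcounts2d(x, y, xedges::AbstractRange, yedges::AbstractRange)
    N = zeros(Int, length(yedges) - 1, length(xedges) - 1)  # rows = y, cols = x
    for (xi, yi) in zip(x, y)
        (isnan(xi) || isnan(yi)) && continue                # ignore NaN pairs
        col = floor(Int, (xi - first(xedges)) / step(xedges)) + 1
        row = floor(Int, (yi - first(yedges)) / step(yedges)) + 1
        if 1 <= row <= size(N, 1) && 1 <= col <= size(N, 2)
            N[row, col] += 1
        end
    end
    return N
end

# One point with a large x and a small y lands in the first row, last column.
N2 = sketch_histcounts2d([2.5], [0.5], 0:1:3, 0:1:2)
```

Here `N2` has two rows (y bins) and three columns (x bins), and the single count sits at `N2[1, 3]`.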

`NaNStatistics.inpctile`

— Method`inpctile(A, p::Number; dims)`

Return a boolean array that identifies which values of the iterable collection `A` fall within the central `p`th percentile, optionally along a dimension specified by `dims`.

A valid percentile value must satisfy 0 <= `p` <= 100.

`NaNStatistics.movmean`

— Method`movmean(x::AbstractVecOrMat, n::Number)`

Simple moving average of `x` in 1 or 2 dimensions, spanning `n` bins (or `n*n` in 2D), returning an array of the same size as `x`.

For the resulting moving average to be symmetric, `n` must be an odd integer; if `n` is not an odd integer, the first odd integer greater than `n` will be used instead.
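The window-size rule can be sketched in a few lines of Base Julia. Both helper names below are hypothetical. The treatment of the ends of the vector (windows truncated at the boundaries) is an assumption, since the docstring does not describe edge handling.

```julia
# Hypothetical helper: the smallest odd integer that is >= n.
effective_window(n) = (m = ceil(Int, n); isodd(m) ? m : m + 1)

# Hypothetical 1D moving average (not the package's code).
# Assumption: windows are truncated at the ends of the vector.
function sketch_movmean(x::AbstractVector, n)
    half = (effective_window(n) - 1) ÷ 2        # points on each side of centre
    out = similar(x, Float64)
    for i in eachindex(x)
        lo = max(firstindex(x), i - half)
        hi = min(lastindex(x), i + half)
        out[i] = sum(@view x[lo:hi]) / (hi - lo + 1)
    end
    return out
end

smoothed = sketch_movmean([1.0, 2.0, 3.0, 4.0, 5.0], 2)   # n = 2 is widened to 3
```

With a three-point window the interior values are unchanged for this linear input, while the two end values are averaged over the two points available.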

`NaNStatistics.nanaad`

— Method`nanaad(A; dims)`

Mean (average) absolute deviation from the mean, ignoring NaNs, of an indexable collection `A`, optionally along a dimension specified by `dims`. Note that for a Normal distribution, sigma = 1.253 * AAD.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).
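As a worked illustration of the definition, the AAD of a small vector can be computed by hand with the Statistics standard library. This is a sketch of the quantity being described, not the package's implementation; the variable names are illustrative.

```julia
using Statistics

v = [1.0, 2.0, 3.0, 4.0, NaN]
valid = filter(!isnan, v)                  # drop NaNs before reducing
aad = mean(abs.(valid .- mean(valid)))     # mean absolute deviation from the mean
sigma_estimate = 1.253 * aad               # only meaningful for roughly Normal data
```

For this input the mean is 2.5, the absolute deviations are 1.5, 0.5, 0.5 and 1.5, and so the AAD is 1.0.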

`NaNStatistics.nanadd!`

— Method`nanadd!(A, B)`

Add the non-NaN elements of `B` to `A`, treating NaNs as zeros.

`NaNStatistics.nanadd`

— Method`nanadd(A, B)`

Add the non-NaN elements of `A` and `B`, treating NaNs as zeros.
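The phrase "treating NaNs as zeros" can be made concrete with a short Base-Julia sketch. The names `zero_if_nan` and `sketch_nanadd` are hypothetical stand-ins for illustration.

```julia
# Hypothetical helpers (illustration only): replace NaN by zero, then add.
zero_if_nan(v) = isnan(v) ? zero(v) : v
sketch_nanadd(A, B) = zero_if_nan.(A) .+ zero_if_nan.(B)

total = sketch_nanadd([1.0, NaN, 3.0], [NaN, NaN, 1.0])
```

Under this reading, a position that is NaN in both inputs contributes zero rather than NaN.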

`NaNStatistics.nanbinmean!`

— Method`nanbinmean!(MU, N, x, y, z, xedges::AbstractRange, yedges::AbstractRange)`

Ignoring NaNs, fill the matrix `MU` with the means and `N` with the counts of non-NaN `z` values that fall into a 2D grid of x and y bins defined by `xedges` and `yedges`. The independent variables `x` and `y`, as well as the dependent variable `z`, are all expected as 1D vectors (any subtype of AbstractVector).

The output matrices `MU` and `N` must be the same size, and must each have `length(yedges)-1` rows and `length(xedges)-1` columns.

`NaNStatistics.nanbinmean!`

— Method`nanbinmean!(MU, [N], x, y, xedges::AbstractRange)`

Ignoring NaNs, fill the array `MU` with the means (and optionally `N` with the counts) of non-NaN `y` values that fall into each of `length(xedges)-1` equally spaced bins along the `x` axis with bin edges specified by `xedges`.

The array of `x` data should be given as a one-dimensional array (any subtype of AbstractVector) and `y` as either a 1-d or 2-d array (any subtype of AbstractVecOrMat).

The output arrays `MU` and `N` must be the same size, and must have the same number of columns as `y`; if `y` is a 2-d array (matrix), then each column of `y` will be treated as a separate variable.

`NaNStatistics.nanbinmean`

— Method`nanbinmean(x, y, xedges::AbstractRange)`

Ignoring NaNs, calculate the mean of `y` values that fall into each of `length(xedges)-1` equally spaced bins along the `x` axis with bin edges specified by `xedges`.

The array of `x` data should be given as a one-dimensional array (any subtype of AbstractVector) and `y` as either a 1-d or 2-d array (any subtype of AbstractVecOrMat). If `y` is a 2-d array, then each column of `y` will be treated as a separate variable.

**Examples**

```
julia> nanbinmean([1:100..., 1], [1:100..., NaN], 0:25:100)
4-element Vector{Float64}:
13.0
38.0
63.0
87.5
julia> nanbinmean(1:100, reshape(1:300,100,3), 0:25:100)
4×3 Matrix{Float64}:
13.0 113.0 213.0
38.0 138.0 238.0
63.0 163.0 263.0
87.5 187.5 287.5
```
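The binning logic behind these results can be sketched in Base Julia as a running sum and count per bin. The helper `sketch_binmean` is hypothetical and is not the package's implementation. It assumes half-open bins `[lo, hi)`, and it assumes that empty bins come out as `NaN`, as they do in the 2D example elsewhere on this page.

```julia
# Hypothetical 1D binned mean (illustration only).
# Assumptions: half-open bins [lo, hi); empty bins yield NaN.
function sketch_binmean(x, y, xedges::AbstractRange)
    nbins = length(xedges) - 1
    sums = zeros(nbins)
    counts = zeros(Int, nbins)
    for (xi, yi) in zip(x, y)
        (isnan(xi) || isnan(yi)) && continue      # ignore NaNs in either input
        i = floor(Int, (xi - first(xedges)) / step(xedges)) + 1
        if 1 <= i <= nbins
            sums[i] += yi
            counts[i] += 1
        end
    end
    return sums ./ counts                         # 0.0 / 0 is NaN for empty bins
end

mu = sketch_binmean([0.5, 0.7, 2.5, 2.6], [1.0, 3.0, NaN, 5.0], 0:1:3)
```

In this sketch the first bin averages 1.0 and 3.0, the second bin is empty, and the third bin keeps only the non-NaN value 5.0.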

`NaNStatistics.nanbinmean`

— Method`nanbinmean(x, y, xedges::AbstractRange)`

Ignoring NaNs, calculate the weighted mean of `y` values that fall into each of `length(xedges)-1` equally spaced bins along the `x` axis with bin edges specified by `xedges`.

The array of `x` data should be given as a one-dimensional array (any subtype of AbstractVector) and `y` as either a 1-d or 2-d array (any subtype of AbstractVecOrMat). If `y` is a 2-d array, then each column of `y` will be treated as a separate variable.

`NaNStatistics.nanbinmean`

— Method`nanbinmean(x, y, z, xedges, yedges)`

Ignoring NaNs, calculate the mean of `z` values that fall into a 2D grid of x and y bins with bin edges defined by `xedges` and `yedges`. The independent variables `x` and `y`, as well as the dependent variable `z`, are all expected as 1D vectors (any subtype of AbstractVector).

**Examples**

```
julia> x = y = z = 0.5:9.5;
julia> xedges = yedges = 0:10;
julia> nanbinmean(x,y,z,xedges,yedges)
10×10 Matrix{Float64}:
0.5 NaN NaN NaN NaN NaN NaN NaN NaN NaN
NaN 1.5 NaN NaN NaN NaN NaN NaN NaN NaN
NaN NaN 2.5 NaN NaN NaN NaN NaN NaN NaN
NaN NaN NaN 3.5 NaN NaN NaN NaN NaN NaN
NaN NaN NaN NaN 4.5 NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN 5.5 NaN NaN NaN NaN
NaN NaN NaN NaN NaN NaN 6.5 NaN NaN NaN
NaN NaN NaN NaN NaN NaN NaN 7.5 NaN NaN
NaN NaN NaN NaN NaN NaN NaN NaN 8.5 NaN
NaN NaN NaN NaN NaN NaN NaN NaN NaN 9.5
```

`NaNStatistics.nanbinmedian!`

— Method`nanbinmedian!(M, [N], x, y, xedges::AbstractRange)`

Fill the array `M` with the medians (and optionally `N` with the counts) of non-NaN `y` values that fall into each of `length(xedges)-1` equally spaced bins along the `x` axis with bin edges specified by `xedges`.

If `y` is a 2-d array (matrix), each column will be treated as a separate variable.

`NaNStatistics.nanbinmedian`

— Method`nanbinmedian(x, y, xedges::AbstractRange)`

Calculate the median, ignoring NaNs, of `y` values that fall into each of `length(xedges)-1` equally spaced bins along the `x` axis with bin edges specified by `xedges`.

If `y` is a 2-d array (matrix), each column will be treated as a separate variable.

**Examples**

```
julia> nanbinmedian([1:100..., 1], [1:100..., NaN], 0:25:100)
4-element Vector{Float64}:
12.5
37.0
62.0
87.0
julia> nanbinmedian(1:100, reshape(1:300,100,3), 0:25:100)
4×3 Matrix{Float64}:
12.5 112.5 212.5
37.0 137.0 237.0
62.0 162.0 262.0
87.0 187.0 287.0
```

`NaNStatistics.nancor`

— Method`nancor(X::AbstractMatrix; dims::Int=1)`

Compute the (Pearson's product-moment) correlation matrix of the matrix `X`, along dimension `dims`. As `Statistics.cor`, but ignoring `NaN`s.

`NaNStatistics.nancor`

— Method`nancor(x::AbstractVector, y::AbstractVector)`

Compute the (Pearson's product-moment) correlation between the vectors `x` and `y`. As `Statistics.cor`, but ignoring `NaN`s.

Equivalent to `nancov(x,y) / (nanstd(x) * nanstd(y))`.

`NaNStatistics.nancov`

— Method`nancov(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)`

Compute the covariance matrix of the matrix `X`, along dimension `dims`. As `Statistics.cov`, but ignoring `NaN`s.

If `corrected` is `true`, as is the default, *Bessel's correction* will be applied, such that the sum is scaled by `n-1` rather than `n`, where `n = length(x)`.

`NaNStatistics.nancov`

— Method`nancov(x::AbstractVector, y::AbstractVector; corrected::Bool=true)`

Compute the covariance between the vectors `x` and `y`. As `Statistics.cov`, but ignoring `NaN`s.

If `corrected` is `true`, as is the default, *Bessel's correction* will be applied, such that the sum is scaled by `n-1` rather than `n`, where `n = length(x)`.
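The effect of the `corrected` keyword can be illustrated with a Base-Julia sketch. The helper `sketch_nancov` is hypothetical. It assumes that a pair is dropped whenever either element is NaN and that `n` counts the retained pairs; both are assumptions about how NaNs are ignored, not statements about the package's internals.

```julia
# Hypothetical NaN-ignoring covariance (illustration only).
# Assumptions: pairs with a NaN in either vector are dropped, and n is the
# number of retained pairs.
function sketch_nancov(x, y; corrected=true)
    keep = .!(isnan.(x) .| isnan.(y))          # pairs with no NaN
    xs, ys = x[keep], y[keep]
    n = length(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    s = sum((xs .- mx) .* (ys .- my))          # sum of cross-deviations
    return s / (corrected ? n - 1 : n)         # Bessel's correction if requested
end

x = [1.0, 2.0, 3.0, NaN]
y = [2.0, 4.0, NaN, 8.0]
c_corrected = sketch_nancov(x, y)
c_uncorrected = sketch_nancov(x, y; corrected=false)
```

Only the pairs (1, 2) and (2, 4) survive here, so the cross-deviation sum of 1.0 is divided by 1 with the correction and by 2 without it.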

`NaNStatistics.nanextrema`

— Method`nanextrema(A; dims)`

Find the extrema (maximum & minimum) of an indexable collection `A`, ignoring NaNs, optionally along a dimension specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

`NaNStatistics.nanmad`

— Method`nanmad(A; dims)`

Median absolute deviation from the median, ignoring NaNs, of an indexable collection `A`, optionally along a dimension specified by `dims`. Note that for a Normal distribution, sigma = 1.4826 * MAD.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).
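A hand computation with the Statistics standard library shows what the MAD measures and why it resists outliers. This is a sketch of the definition, not the package's implementation; the variable names are illustrative.

```julia
using Statistics

v = [1.0, 2.0, 3.0, 4.0, 100.0, NaN]
valid = filter(!isnan, v)                        # drop NaNs before reducing
mad = median(abs.(valid .- median(valid)))       # median absolute deviation
sigma_estimate = 1.4826 * mad                    # Normal-consistent scale estimate
```

The median is 3.0 and the absolute deviations are 2, 1, 0, 1 and 97, so the MAD is 1.0 even though one value is far from the rest.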

`NaNStatistics.nanmask!`

— Method`nanmask!(mask, A)`

Fill a Boolean mask of dimensions `size(A)` that is false wherever `A` is `NaN`.

`NaNStatistics.nanmask`

— Method`nanmask(A)`

Create a Boolean mask of dimensions `size(A)` that is false wherever `A` is `NaN`.

`NaNStatistics.nanmax`

— Method`nanmax(a,b)`

As `max(a,b)`, but if either argument is `NaN`, return the other one.

`NaNStatistics.nanmaximum`

— Method`nanmaximum(A; dims)`

Find the largest non-NaN value of an indexable collection `A`, optionally along a dimension specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

`NaNStatistics.nanmean`

— Method`nanmean(A, W; dims)`

Ignoring NaNs, calculate the weighted mean of an indexable collection `A`, optionally along dimensions specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).
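A weighted mean that ignores NaNs can be sketched in Base Julia as a ratio of two running sums. The helper `sketch_weighted_nanmean` is hypothetical. It assumes that the weight attached to a NaN element is left out of the denominator as well, which the docstring does not spell out.

```julia
# Hypothetical weighted NaN-ignoring mean (illustration only).
# Assumption: weights paired with NaN elements are excluded from the total.
function sketch_weighted_nanmean(A, W)
    num = 0.0
    den = 0.0
    for (a, w) in zip(A, W)
        isnan(a) && continue       # skip the element and its weight
        num += w * a
        den += w
    end
    return num / den
end

wmean = sketch_weighted_nanmean([1.0, NaN, 3.0], [1.0, 1.0, 3.0])
```

Here the result is (1·1 + 3·3) / (1 + 3) = 2.5.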

`NaNStatistics.nanmean`

— Method`nanmean(A; dims)`

Compute the mean of all non-`NaN` elements in `A`, optionally over dimensions specified by `dims`. As `Statistics.mean`, but ignoring `NaN`s.

As an alternative to `dims`, `nanmean` also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

**Examples**

```
julia> using NaNStatistics
julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1 2
3 4
julia> nanmean(A, dims=1)
1×2 Matrix{Float64}:
2.0 3.0
julia> nanmean(A, dims=2)
2×1 Matrix{Float64}:
1.5
3.5
```
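The difference between the `dims` and `dim` keywords comes down to the shape of the result. It can be pictured with Base functions alone: the described `dim` behaviour corresponds to reducing with `dims` and then calling `dropdims`. This is a sketch of the shapes involved, not a claim about how the package implements the keyword.

```julia
using Statistics

A = [1 2; 3 4]
kept = mean(A, dims=1)               # 1×2 Matrix: the reduced dimension is kept
dropped = dropdims(kept, dims=1)     # 2-element Vector: singleton dimension dropped
```

Both results contain the column means 2.0 and 3.0; only the number of dimensions differs.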

`NaNStatistics.nanmedian`

— Method`nanmedian(A; dims)`

Calculate the median, ignoring NaNs, of an indexable collection `A`, optionally along a dimension specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

`NaNStatistics.nanmin`

— Method`nanmin(a,b)`

As `min(a,b)`, but if either argument is `NaN`, return the other one.

`NaNStatistics.nanminimum`

— Method`nanminimum(A; dims)`

As `minimum`, but ignoring `NaN`s: find the smallest non-`NaN` value of an indexable collection `A`, optionally along a dimension specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

`NaNStatistics.nanpctile`

— Method`nanpctile(A, p; dims)`

Find the `p`th percentile of an indexable collection `A`, ignoring NaNs, optionally along a dimension specified by `dims`.

A valid percentile value must satisfy 0 <= `p` <= 100.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).
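Since `p` runs from 0 to 100 rather than from 0 to 1, a percentile maps onto a quantile at `p / 100`. The sketch below uses `Statistics.quantile` on the non-NaN values; the helper name `sketch_pctile` is hypothetical. Its default linear interpolation is an assumption, and the package may interpolate between samples differently.

```julia
using Statistics

# Hypothetical helper (illustration only): percentile of the non-NaN values.
# Assumption: linear interpolation, as in Statistics.quantile's default.
sketch_pctile(A, p) = quantile(filter(!isnan, A), p / 100)

data = [1.0, 2.0, 3.0, 4.0, 5.0, NaN]
p50 = sketch_pctile(data, 50)      # median of the five non-NaN values
p0 = sketch_pctile(data, 0)        # smallest non-NaN value
p100 = sketch_pctile(data, 100)    # largest non-NaN value
```

The endpoints `p = 0` and `p = 100` recover the minimum and maximum of the non-NaN values.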

`NaNStatistics.nanrange`

— Method`nanrange(A; dims)`

Calculate the range (maximum - minimum) of an indexable collection `A`, ignoring NaNs, optionally along a dimension specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

`NaNStatistics.nanstandardize!`

— Method`nanstandardize!(A::Array{<:AbstractFloat}; dims)`

Rescale `A` to unit variance and zero mean, i.e. `A .= (A .- nanmean(A)) ./ nanstd(A)`.

`NaNStatistics.nanstandardize`

— Method`nanstandardize(A; dims)`

Rescale a copy of `A` to unit variance and zero mean, i.e. `(A .- nanmean(A)) ./ nanstd(A)`.
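The rescaling formula can be tried out with the Statistics standard library by computing the mean and standard deviation over the non-NaN values only. The helper `sketch_standardize` is hypothetical and ignores the `dims` keyword for brevity.

```julia
using Statistics

# Hypothetical helper (illustration only): standardize using NaN-free statistics.
function sketch_standardize(A)
    valid = filter(!isnan, A)                 # statistics ignore NaNs
    return (A .- mean(valid)) ./ std(valid)   # NaN inputs stay NaN in the output
end

z = sketch_standardize([1.0, 2.0, 3.0, NaN])
```

The non-NaN values have mean 2.0 and standard deviation 1.0, so they map to -1.0, 0.0 and 1.0, while the NaN entry remains NaN.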

`NaNStatistics.nanstd`

— Method`nanstd(A, W; dims)`

Calculate the weighted standard deviation, ignoring NaNs, of an indexable collection `A`, optionally along a dimension specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

`NaNStatistics.nanstd`

— Method`nanstd(A; dims=:, mean=nothing, corrected=true)`

Compute the standard deviation of all non-`NaN` elements in `A`, optionally over dimensions specified by `dims`. As `Statistics.std`, but ignoring `NaN`s.

A precomputed `mean` may optionally be provided, which results in a somewhat faster calculation. If `corrected` is `true`, then *Bessel's correction* is applied, such that the sum is divided by `n-1` rather than `n`.

As an alternative to `dims`, `nanstd` also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

**Examples**

```
julia> using NaNStatistics
julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1 2
3 4
julia> nanstd(A, dims=1)
1×2 Matrix{Float64}:
1.41421 1.41421
julia> nanstd(A, dims=2)
2×1 Matrix{Float64}:
0.7071067811865476
0.7071067811865476
```

`NaNStatistics.nansum`

— Method`nansum(A; dims)`

Calculate the sum of an indexable collection `A`, ignoring NaNs, optionally along dimensions specified by `dims`.

Also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

**Examples**

```
julia> using NaNStatistics
julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1 2
3 4
julia> nansum(A, dims=1)
1×2 Matrix{Int64}:
4 6
julia> nansum(A, dims=2)
2×1 Matrix{Int64}:
3
7
```

`NaNStatistics.nanvar`

— Method`nanvar(A; dims=:, mean=nothing, corrected=true)`

Compute the variance of all non-`NaN` elements in `A`, optionally over dimensions specified by `dims`. As `Statistics.var`, but ignoring `NaN`s.

A precomputed `mean` may optionally be provided, which results in a somewhat faster calculation. If `corrected` is `true`, then *Bessel's correction* is applied, such that the sum is divided by `n-1` rather than `n`.

As an alternative to `dims`, `nanvar` also supports the `dim` keyword, which behaves identically to `dims`, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

**Examples**

```
julia> using NaNStatistics
julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1 2
3 4
julia> nanvar(A, dims=1)
1×2 Matrix{Float64}:
2.0 2.0
julia> nanvar(A, dims=2)
2×1 Matrix{Float64}:
0.5
0.5
```
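The roles of the `mean` and `corrected` keywords can be illustrated with a Base-Julia sketch. The helper `sketch_nanvar` is hypothetical and handles only the whole-array case; it takes `n` to be the number of non-NaN elements, which is an assumption.

```julia
# Hypothetical NaN-ignoring variance (illustration only, whole-array case).
# Assumption: n is the number of non-NaN elements.
function sketch_nanvar(A; mean=nothing, corrected=true)
    valid = filter(!isnan, A)
    n = length(valid)
    mu = mean === nothing ? sum(valid) / n : mean   # reuse a precomputed mean
    return sum(abs2, valid .- mu) / (corrected ? n - 1 : n)
end

v = [1.0, 2.0, 3.0, 4.0, NaN]
var_corrected = sketch_nanvar(v)                     # divides by n - 1 = 3
var_uncorrected = sketch_nanvar(v; corrected=false)  # divides by n = 4
var_with_mean = sketch_nanvar(v; mean=2.5)           # skips recomputing the mean
```

The squared deviations sum to 5.0, giving 5/3 with Bessel's correction and 1.25 without; supplying the mean up front changes only the work done, not the result.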

`NaNStatistics.zeronan!`

— Method`zeronan!(A)`

Replace all `NaN`s in `A` with zeros of the same type.