# NaNStatistics

NaNStatistics.histcounts! - Method
histcounts!(N, x, y, xedges::AbstractRange, yedges::AbstractRange)

Simple 2D histogram; as histcounts, but in-place, adding counts to the first length(xedges)-1 columns and the first length(yedges)-1 rows of Array N.

Note that counts are added to N rather than overwriting it, which allows you to produce cumulative histograms. However, this means you must initialize N with zeros before first use.

NaNStatistics.histcounts! - Method
histcounts!(N, x, xedges::AbstractRange)

Simple 1D histogram; as histcounts, but in-place, adding counts to the first length(xedges)-1 elements of Array N.

Note that counts are added to N rather than overwriting it, which allows you to produce cumulative histograms. However, this means you must initialize N with zeros before first use.
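
The accumulate-in-place behaviour can be illustrated with a small stand-alone sketch. The helper `sketch_histcounts!` is hypothetical and is not the package's implementation; it assumes half-open bins and uses only Base, so that the effect of starting from zeros and calling twice is easy to see.

```julia
# Hypothetical helper, not the package's implementation: accumulate counts
# into N, assuming half-open bins [edge[i], edge[i+1]) and skipping NaNs.
function sketch_histcounts!(N::AbstractVector, x, xedges::AbstractRange)
    nbins = length(xedges) - 1
    for xi in x
        isnan(xi) && continue                      # NaNs are never counted
        i = floor(Int, (xi - first(xedges)) / step(xedges)) + 1
        1 <= i <= nbins && (N[i] += 1)             # out-of-range values are dropped
    end
    return N
end

N = zeros(Int, 4)                                  # must start from zeros
sketch_histcounts!(N, [0.5, 1.5, 1.7, NaN], 0:1:4)
first_pass = copy(N)                               # counts after the first call
sketch_histcounts!(N, [2.5, 3.5], 0:1:4)           # second call adds to the same N
```

After the second call, N holds the counts from both batches, which is why an uninitialized N would give meaningless totals.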

NaNStatistics.histcounts - Method
histcounts(x, xedges::AbstractRange; T=Int64)::Vector{T}

A 1D histogram, ignoring NaNs: calculate the number of x values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

By default, the counts are returned as Int64s, though this can be changed by specifying an output type with the optional keyword argument T.

Examples

julia> b = 10 * rand(100000);

julia> histcounts(b, 0:1:10)
10-element Vector{Int64}:
10054
9987
9851
9971
9832
10033
10250
10039
9950
10033

NaNStatistics.histcounts - Method
histcounts(x, y, xedges::AbstractRange, yedges::AbstractRange; T=Int64)::Matrix{T}

A 2D histogram, ignoring NaNs: calculate the number of x, y pairs that fall into each square of a 2D grid of equally-spaced square bins with edges specified by xedges and yedges.

The resulting matrix N of counts is oriented with the lowest x and y bins in N[1,1], where the first (vertical / row) dimension of N corresponds to the y axis (with size(N,1) == length(yedges)-1) and the second (horizontal / column) dimension of N corresponds to the x axis (with size(N,2) == length(xedges)-1).

By default, the counts are returned as Int64s, though this can be changed by specifying an output type with the optional keyword argument T.
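
Because a diagonal input does not reveal which axis maps to rows, the orientation can be checked with a single off-diagonal point. The function `sketch_histcounts2d` below is a hypothetical stand-in written with Base only; it assumes half-open bins and is not the package's implementation.

```julia
# Hypothetical sketch of the orientation described above; bin edges are
# assumed half-open and this is not the package's implementation.
function sketch_histcounts2d(x, y, xedges::AbstractRange, yedges::AbstractRange)
    # rows follow y, columns follow x
    N = zeros(Int, length(yedges) - 1, length(xedges) - 1)
    for (xi, yi) in zip(x, y)
        (isnan(xi) || isnan(yi)) && continue
        col = floor(Int, (xi - first(xedges)) / step(xedges)) + 1
        row = floor(Int, (yi - first(yedges)) / step(yedges)) + 1
        if 1 <= row <= size(N, 1) && 1 <= col <= size(N, 2)
            N[row, col] += 1
        end
    end
    return N
end

# One point in the third x bin and the first y bin
N = sketch_histcounts2d([2.5], [0.5], 0:1:3, 0:1:2)
```

Here N is 2×3 (two y bins, three x bins) and the single count lands in N[1,3].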

Examples

julia> x = y = 0.5:9.5;

julia> xedges = yedges = 0:10;

julia> N = histcounts(x,y,xedges,yedges)
10×10 Matrix{Int64}:
1  0  0  0  0  0  0  0  0  0
0  1  0  0  0  0  0  0  0  0
0  0  1  0  0  0  0  0  0  0
0  0  0  1  0  0  0  0  0  0
0  0  0  0  1  0  0  0  0  0
0  0  0  0  0  1  0  0  0  0
0  0  0  0  0  0  1  0  0  0
0  0  0  0  0  0  0  1  0  0
0  0  0  0  0  0  0  0  1  0
0  0  0  0  0  0  0  0  0  1

NaNStatistics.inpctile - Method
inpctile(A, p::Number; dims)

Return a boolean array that identifies which values of the iterable collection A fall within the central pth percentile, optionally along a dimension specified by dims.

A valid percentile value must satisfy 0 <= p <= 100.
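
One reading of "central pth percentile" is the band between the (100-p)/2 and 100-(100-p)/2 percentiles. The sketch below follows that reading for a flat collection, using `Statistics.quantile`; the name `sketch_inpctile` is hypothetical, and the package's own interpolation and dims handling may differ.

```julia
using Statistics

# Hypothetical sketch: mark values lying between the (100-p)/2 and
# 100-(100-p)/2 percentiles of the non-NaN data.
function sketch_inpctile(A, p::Number)
    0 <= p <= 100 || throw(ArgumentError("p must satisfy 0 <= p <= 100"))
    v = filter(!isnan, vec(collect(A)))
    offset = (100 - p) / 2
    lo = quantile(v, offset / 100)
    hi = quantile(v, 1 - offset / 100)
    return lo .<= A .<= hi          # NaN entries compare false
end

mask = sketch_inpctile(collect(1.0:11.0), 50)
```

For the values 1 through 11 and p = 50, the band runs from 3.5 to 8.5, so the mask selects 4 through 8.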

NaNStatistics.movmean - Method
movmean(x::AbstractVecOrMat, n::Number)

Simple moving average of x in 1 or 2 dimensions, spanning n bins (or n*n in 2D), returning an array of the same size as x.

For the resulting moving average to be symmetric, n must be an odd integer; if n is not an odd integer, the first odd integer greater than n will be used instead.
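
The window rule can be sketched in one dimension as follows. `sketch_movmean` is a hypothetical helper, and truncating the window at the ends of x (rather than padding) is an assumption made for this illustration.

```julia
# Hypothetical 1D sketch, not the package's implementation. The window is
# assumed to be truncated at the ends of x rather than padded.
function sketch_movmean(x::AbstractVector, n::Number)
    halfwidth = ceil(Int, (n - 1) / 2)        # n = 3 gives 1; n = 4 gives 2 (window of 5)
    m = similar(x, Float64)
    for i in eachindex(x)
        lo = max(i - halfwidth, firstindex(x))
        hi = min(i + halfwidth, lastindex(x))
        m[i] = sum(view(x, lo:hi)) / (hi - lo + 1)
    end
    return m
end

m3 = sketch_movmean([1.0, 2.0, 3.0, 4.0, 5.0], 3)
m4 = sketch_movmean([1.0, 2.0, 3.0, 4.0, 5.0], 4)   # behaves like n = 5
```

With n = 4 the half-width rounds up to 2, so the effective window is 5 bins wide, matching the rule that a non-odd n is replaced by the next odd integer.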

NaNStatistics.nanaad - Method
nanaad(A; dims)

Mean (average) absolute deviation from the mean, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims. Note that for a Normal distribution, sigma = 1.253 * AAD.
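
As a reference for what is being computed, the definition can be written out for a flat collection. `sketch_nanaad` is a hypothetical name, and the sketch ignores the dims keyword.

```julia
using Statistics

# Hypothetical reference version for a flat collection (no dims support).
function sketch_nanaad(A)
    v = filter(!isnan, vec(collect(A)))
    return mean(abs.(v .- mean(v)))
end

aad = sketch_nanaad([1.0, 2.0, 3.0, 4.0, NaN])
sigma_estimate = 1.253 * aad     # scaling quoted above for Normal data
```

The non-NaN values have mean 2.5 and absolute deviations 1.5, 0.5, 0.5, 1.5, so the AAD is 1.0.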

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).
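
The difference in output shape between the two keywords can be illustrated with Base functions alone. This is an analogy, under the assumption that dim behaves like dims followed by `dropdims`; it does not call the package.

```julia
A = [1.0 2.0; 3.0 4.0]

kept    = sum(A, dims=1)                    # 1×2 Matrix: reduced dimension kept with size 1
dropped = dropdims(sum(A, dims=1), dims=1)  # 2-element Vector: singleton dimension removed
```

The values are the same in both cases; only the shape of the result changes.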

NaNStatistics.nanbinmean! - Method
nanbinmean!(MU, N, x, y, z, xedges::AbstractRange, yedges::AbstractRange)

Ignoring NaNs, fill the matrix MU with the means and N with the counts of non-NaN z values that fall into a 2D grid of x and y bins defined by xedges and yedges. The independent variables x and y, as well as the dependent variable z, are all expected as 1D vectors (any subtype of AbstractVector).

The output matrices MU and N must be the same size, and must each have length(yedges)-1 rows and length(xedges)-1 columns.

NaNStatistics.nanbinmean! - Method
nanbinmean!(MU, [N], x, y, xedges::AbstractRange)

Ignoring NaNs, fill the array MU with the means (and optionally N with the counts) of non-NaN y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

The array of x data should be given as a one-dimensional array (any subtype of AbstractVector) and y as either a 1-d or 2-d array (any subtype of AbstractVecOrMat).

The output arrays MU and N must be the same size, and must have the same number of columns as y; if y is a 2-d array (matrix), then each column of y will be treated as a separate variable.
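
The relationship between MU and N can be sketched for the 1D case as a running sum divided by a count. `sketch_nanbinmean!` is a hypothetical helper, not the package's implementation; it resets both arrays first (an assumption), and the sample data avoid bin edges because edge handling is not specified above.

```julia
# Hypothetical sketch, not the package's implementation. Values that fall
# exactly on a bin edge are avoided in the example below because edge
# handling is not specified in the text above.
function sketch_nanbinmean!(MU::AbstractVector, N::AbstractVector, x, y, xedges::AbstractRange)
    fill!(MU, 0.0)
    fill!(N, 0)
    for (xi, yi) in zip(x, y)
        (isnan(xi) || isnan(yi)) && continue
        i = floor(Int, (xi - first(xedges)) / step(xedges)) + 1
        if 1 <= i <= length(MU)
            MU[i] += yi          # running sum for now
            N[i] += 1
        end
    end
    MU ./= N                     # sum / count; an empty bin gives 0/0 = NaN
    return MU
end

MU = Vector{Float64}(undef, 3)   # one entry per bin: length(xedges) - 1
N  = Vector{Int}(undef, 3)
sketch_nanbinmean!(MU, N, [0.5, 1.5, 2.5, 3.5], [10.0, NaN, 30.0, 50.0], 0:2:6)
```

The first bin keeps only the non-NaN value 10.0, the second averages 30.0 and 50.0, and the third is empty, so its mean is NaN and its count is 0.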

NaNStatistics.nanbinmean - Method
nanbinmean(x, y, xedges::AbstractRange)

Ignoring NaNs, calculate the mean of y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

The array of x data should be given as a one-dimensional array (any subtype of AbstractVector) and y as either a 1-d or 2-d array (any subtype of AbstractVecOrMat). If y is a 2-d array, then each column of y will be treated as a separate variable.

Examples

julia> nanbinmean([1:100..., 1], [1:100..., NaN], 0:25:100)
4-element Vector{Float64}:
13.0
38.0
63.0
87.5

julia> nanbinmean(1:100, reshape(1:300,100,3), 0:25:100)
4×3 Matrix{Float64}:
13.0  113.0  213.0
38.0  138.0  238.0
63.0  163.0  263.0
87.5  187.5  287.5

NaNStatistics.nanbinmean - Method
nanbinmean(x, y, xedges::AbstractRange)

Ignoring NaNs, calculate the weighted mean of y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

The array of x data should be given as a one-dimensional array (any subtype of AbstractVector) and y as either a 1-d or 2-d array (any subtype of AbstractVecOrMat). If y is a 2-d array, then each column of y will be treated as a separate variable.

NaNStatistics.nanbinmean - Method
nanbinmean(x, y, z, xedges, yedges)

Ignoring NaNs, calculate the mean of z values that fall into a 2D grid of x and y bins with bin edges defined by xedges and yedges. The independent variables x and y, as well as the dependent variable z, are all expected as 1D vectors (any subtype of AbstractVector).

Examples

julia> x = y = z = 0.5:9.5;

julia> xedges = yedges = 0:10;

julia> nanbinmean(x,y,z,xedges,yedges)
10×10 Matrix{Float64}:
  0.5  NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
NaN      1.5  NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
NaN    NaN      2.5  NaN    NaN    NaN    NaN    NaN    NaN    NaN
NaN    NaN    NaN      3.5  NaN    NaN    NaN    NaN    NaN    NaN
NaN    NaN    NaN    NaN      4.5  NaN    NaN    NaN    NaN    NaN
NaN    NaN    NaN    NaN    NaN      5.5  NaN    NaN    NaN    NaN
NaN    NaN    NaN    NaN    NaN    NaN      6.5  NaN    NaN    NaN
NaN    NaN    NaN    NaN    NaN    NaN    NaN      7.5  NaN    NaN
NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN      8.5  NaN
NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN      9.5

NaNStatistics.nanbinmedian! - Method
nanbinmedian!(M, [N], x, y, xedges::AbstractRange)

Fill the array M with the medians (and optionally N with the counts) of non-NaN y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

If y is a 2-d array (matrix), each column will be treated as a separate variable.

NaNStatistics.nanbinmedian - Method
nanbinmedian(x, y, xedges::AbstractRange)

Calculate the median, ignoring NaNs, of y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

If y is a 2-d array (matrix), each column will be treated as a separate variable.

Examples

julia> nanbinmedian([1:100..., 1], [1:100..., NaN], 0:25:100)
4-element Vector{Float64}:
12.5
37.0
62.0
87.0

julia> nanbinmedian(1:100, reshape(1:300,100,3), 0:25:100)
4×3 Matrix{Float64}:
12.5  112.5  212.5
37.0  137.0  237.0
62.0  162.0  262.0
87.0  187.0  287.0

NaNStatistics.nancor - Method
nancor(X::AbstractMatrix; dims::Int=1)

Compute the (Pearson's product-moment) correlation matrix of the matrix X, along dimension dims. As Statistics.cor, but ignoring NaNs.

NaNStatistics.nancor - Method
nancor(x::AbstractVector, y::AbstractVector)

Compute the (Pearson's product-moment) correlation between the vectors x and y. As Statistics.cor, but ignoring NaNs.

Equivalent to nancov(x,y) / (nanstd(x) * nanstd(y)).
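
The equivalence can be checked with the Statistics standard library on data with the NaN pairs removed by hand. `sketch_nancor` is a hypothetical helper; dropping every pair in which either value is NaN is an assumption about how "ignoring NaNs" applies to paired data.

```julia
using Statistics

# Hypothetical sketch: drop every pair in which either value is NaN, then
# apply the formula quoted above to what remains.
function sketch_nancor(x::AbstractVector, y::AbstractVector)
    keep = .!(isnan.(x) .| isnan.(y))
    xs, ys = x[keep], y[keep]
    return cov(xs, ys) / (std(xs) * std(ys))
end

r = sketch_nancor([1.0, 2.0, 3.0, 4.0, NaN], [2.0, 4.0, 6.0, NaN, 10.0])
```

Three complete pairs remain, and since y is exactly 2x on those pairs the correlation comes out as 1 (up to floating-point rounding).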

NaNStatistics.nancov - Method
nancov(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)

Compute the covariance matrix of the matrix X, along dimension dims. As Statistics.cov, but ignoring NaNs.

If corrected is true as is the default, Bessel's correction will be applied, such that the sum is scaled by n-1 rather than n, where n = length(x).

NaNStatistics.nancov - Method
nancov(x::AbstractVector, y::AbstractVector; corrected::Bool=true)

Compute the covariance between the vectors x and y. As Statistics.cov, but ignoring NaNs.

If corrected is true as is the default, Bessel's correction will be applied, such that the sum is scaled by n-1 rather than n, where n = length(x).
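
The effect of the corrected keyword can be seen with `Statistics.cov` on hand-filtered data. This sketch does not call the package; dropping pairs that contain a NaN is an assumption, and n below counts the pairs that are kept.

```julia
using Statistics

x = [1.0, 2.0, 3.0, NaN]
y = [1.0, 2.0, 3.0, 4.0]
keep = .!(isnan.(x) .| isnan.(y))      # assumption: pairs with any NaN are dropped
xs, ys = x[keep], y[keep]              # three valid pairs remain

c_corrected   = cov(xs, ys)                    # divides by n - 1 = 2
c_uncorrected = cov(xs, ys; corrected=false)   # divides by n = 3
```

The sum of products of deviations is 2, giving 1.0 with Bessel's correction and 2/3 without it.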

NaNStatistics.nanextrema - Method
nanextrema(A; dims)

Find the extrema (maximum & minimum) of an indexable collection A, ignoring NaNs, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanmad - Method
nanmad(A; dims)

Median absolute deviation from the median, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims. Note that for a Normal distribution, sigma = 1.4826 * MAD.
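
The definition can be written out for a flat collection to show why the MAD is robust to outliers. `sketch_nanmad` is a hypothetical name, and the sketch ignores the dims keyword.

```julia
using Statistics

# Hypothetical reference version for a flat collection (no dims support).
function sketch_nanmad(A)
    v = filter(!isnan, vec(collect(A)))
    return median(abs.(v .- median(v)))
end

mad = sketch_nanmad([1.0, 2.0, 3.0, 4.0, 100.0, NaN])
sigma_estimate = 1.4826 * mad    # scaling quoted above for Normal data
```

The median of the non-NaN values is 3, the absolute deviations are 2, 1, 0, 1, 97, and their median is 1.0, so the outlier at 100 has no effect on the result.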

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanmask! - Method
nanmask!(mask, A)

Fill a Boolean mask of dimensions size(A) that is false wherever A is NaN.

NaNStatistics.nanmask - Method
nanmask(A)

Create a Boolean mask of dimensions size(A) that is false wherever A is NaN.
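
A mask with the same meaning can be built with Base broadcasting, which shows how such a mask is typically used for indexing. This is an equivalent written by hand, not a call into the package.

```julia
A = [1.0 NaN; 3.0 4.0]
mask = .!isnan.(A)      # true where A holds a number, false where it is NaN
valid = A[mask]         # non-NaN values in column-major order
```

Indexing with the mask returns a vector of the three non-NaN values.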

NaNStatistics.nanmaximum - Method
nanmaximum(A; dims)

Find the largest non-NaN value of an indexable collection A, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanmean - Method
nanmean(A, W; dims)

Ignoring NaNs, calculate the weighted mean of an indexable collection A, optionally along dimensions specified by dims.
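
A weighted mean that skips NaNs can be sketched as a weighted sum divided by the sum of the weights actually used. `sketch_nanwmean` is a hypothetical helper; skipping a NaN element together with its weight is an assumption, and the treatment of NaN weights is left out.

```julia
# Hypothetical sketch: elements where A is NaN are skipped along with their weights.
function sketch_nanwmean(A, W)
    total = 0.0
    wsum = 0.0
    for (a, w) in zip(A, W)
        isnan(a) && continue
        total += a * w
        wsum += w
    end
    return total / wsum
end

m = sketch_nanwmean([1.0, 2.0, NaN], [1.0, 3.0, 5.0])
```

Only the first two elements contribute, giving (1*1 + 2*3) / (1 + 3) = 1.75.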

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanmean - Method
nanmean(A; dims)

Compute the mean of all non-NaN elements in A, optionally over dimensions specified by dims. As Statistics.mean, but ignoring NaNs.

As an alternative to dims, nanmean also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1  2
3  4

julia> nanmean(A, dims=1)
1×2 Matrix{Float64}:
2.0  3.0

julia> nanmean(A, dims=2)
2×1 Matrix{Float64}:
1.5
3.5

NaNStatistics.nanmedian - Method
nanmedian(A; dims)

Calculate the median, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanminimum - Method
nanminimum(A; dims)

As minimum but ignoring NaNs: Find the smallest non-NaN value of an indexable collection A, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanpctile - Method
nanpctile(A, p; dims)

Find the pth percentile of an indexable collection A, ignoring NaNs, optionally along a dimension specified by dims.

A valid percentile value must satisfy 0 <= p <= 100.
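
For a flat collection, the same idea can be expressed with `Statistics.quantile`, which takes a fraction rather than a percentage. `sketch_nanpctile` is a hypothetical helper; the package's interpolation between data points may differ from the standard library's default.

```julia
using Statistics

# Hypothetical reference version for a flat collection (no dims support).
function sketch_nanpctile(A, p::Number)
    0 <= p <= 100 || throw(ArgumentError("p must satisfy 0 <= p <= 100"))
    v = filter(!isnan, vec(collect(A)))
    return quantile(v, p / 100)     # Statistics.quantile takes a fraction in [0, 1]
end

A = [5.0, NaN, 1.0, 3.0, 2.0, 4.0]
p0, p50, p100 = sketch_nanpctile(A, 0), sketch_nanpctile(A, 50), sketch_nanpctile(A, 100)
```

The 0th and 100th percentiles are the smallest and largest non-NaN values, and the 50th is the median.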

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanrange - Method
nanrange(A; dims)

Calculate the range (maximum - minimum) of an indexable collection A, ignoring NaNs, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanstandardize! - Method
nanstandardize!(A::Array{<:AbstractFloat}; dims)

Rescale A in place to unit variance and zero mean, i.e. A .= (A .- nanmean(A)) ./ nanstd(A)
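
The rescaling formula can be tried out with the Statistics standard library on hand-filtered data. `sketch_nanstandardize!` is a hypothetical helper that treats the whole array at once and does not support dims.

```julia
using Statistics

# Hypothetical sketch for a whole array (no dims support).
function sketch_nanstandardize!(A::Array{<:AbstractFloat})
    v = filter(!isnan, vec(A))
    mu, sigma = mean(v), std(v)
    A .= (A .- mu) ./ sigma        # NaN entries stay NaN
    return A
end

A = [1.0, 2.0, 3.0, NaN]
sketch_nanstandardize!(A)
```

The non-NaN values have mean 2 and standard deviation 1, so A becomes -1, 0, 1, with the NaN left in place.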

NaNStatistics.nanstd - Method
nanstd(A, W; dims)

Calculate the weighted standard deviation, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

NaNStatistics.nanstd - Method
nanstd(A; dims=:, mean=nothing, corrected=true)

Compute the standard deviation of all non-NaN elements in A, optionally over dimensions specified by dims. As Statistics.std, but ignoring NaNs.

A precomputed mean may optionally be provided, which results in a somewhat faster calculation. If corrected is true, then Bessel's correction is applied, such that the sum is divided by n-1 rather than n.

As an alternative to dims, nanstd also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1  2
3  4

julia> nanstd(A, dims=1)
1×2 Matrix{Float64}:
1.41421  1.41421

julia> nanstd(A, dims=2)
2×1 Matrix{Float64}:
0.7071067811865476
0.7071067811865476

NaNStatistics.nansum - Method
nansum(A; dims)

Calculate the sum of an indexable collection A, ignoring NaNs, optionally along dimensions specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1  2
3  4

julia> nansum(A, dims=1)
1×2 Matrix{Int64}:
4  6

julia> nansum(A, dims=2)
2×1 Matrix{Int64}:
3
7

NaNStatistics.nanvar - Method
nanvar(A; dims=:, mean=nothing, corrected=true)

Compute the variance of all non-NaN elements in A, optionally over dimensions specified by dims. As Statistics.var, but ignoring NaNs.

A precomputed mean may optionally be provided, which results in a somewhat faster calculation. If corrected is true, then Bessel's correction is applied, such that the sum is divided by n-1 rather than n.
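
The roles of the mean and corrected keywords can be seen with `Statistics.var`, which accepts keywords of the same names. The sketch filters NaNs by hand and does not call the package.

```julia
using Statistics

v = filter(!isnan, [1.0, 2.0, 3.0, 4.0, NaN])
mu = mean(v)                                   # computed once and reused below

var_corrected   = var(v; mean=mu)                    # sum of squares / (n - 1)
var_uncorrected = var(v; mean=mu, corrected=false)   # sum of squares / n
std_corrected   = sqrt(var_corrected)
```

With four non-NaN values and a sum of squared deviations of 5, the corrected variance is 5/3 and the uncorrected variance is 1.25.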

As an alternative to dims, nanvar also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
1  2
3  4

julia> nanvar(A, dims=1)
1×2 Matrix{Float64}:
2.0  2.0

julia> nanvar(A, dims=2)
2×1 Matrix{Float64}:
0.5
0.5