NaNStatistics

Documentation for the NaNStatistics.jl package.

NaNStatistics.countnans
NaNStatistics.countnotnans
NaNStatistics.histcountindices
NaNStatistics.histcountindices!
NaNStatistics.histcounts
NaNStatistics.histcounts
NaNStatistics.histcounts!
NaNStatistics.histcounts!
NaNStatistics.histkurtosis
NaNStatistics.histmean
NaNStatistics.histskewness
NaNStatistics.histstd
NaNStatistics.histvar
NaNStatistics.inpctile
NaNStatistics.movmean
NaNStatistics.movmean
NaNStatistics.movsum
NaNStatistics.nanaad
NaNStatistics.nanadd
NaNStatistics.nanadd!
NaNStatistics.nanargmax
NaNStatistics.nanargmin
NaNStatistics.nanbinmean
NaNStatistics.nanbinmean
NaNStatistics.nanbinmean
NaNStatistics.nanbinmean!
NaNStatistics.nanbinmean!
NaNStatistics.nanbinmedian
NaNStatistics.nanbinmedian!
NaNStatistics.nancor
NaNStatistics.nancor
NaNStatistics.nancov
NaNStatistics.nancov
NaNStatistics.nancovem
NaNStatistics.nancovem
NaNStatistics.nancumsum
NaNStatistics.nanextrema
NaNStatistics.nankurtosis
NaNStatistics.nanlogcumsumexp
NaNStatistics.nanlogsumexp
NaNStatistics.nanmad
NaNStatistics.nanmad!
NaNStatistics.nanmask
NaNStatistics.nanmask!
NaNStatistics.nanmax
NaNStatistics.nanmaximum
NaNStatistics.nanmean
NaNStatistics.nanmean
NaNStatistics.nanmean!
NaNStatistics.nanmedian
NaNStatistics.nanmedian!
NaNStatistics.nanmin
NaNStatistics.nanminimum
NaNStatistics.nanpctile
NaNStatistics.nanpctile!
NaNStatistics.nanquantile
NaNStatistics.nanquantile!
NaNStatistics.nanrange
NaNStatistics.nansem
NaNStatistics.nanskewness
NaNStatistics.nanstandardize
NaNStatistics.nanstandardize!
NaNStatistics.nanstd
NaNStatistics.nanstd
NaNStatistics.nansum
NaNStatistics.nansum!
NaNStatistics.nanvar
NaNStatistics.zeronan!

NaNStatistics.countnans — Method

countnans(A)

Return the number of elements of A that are NaNs.

source

NaNStatistics.countnotnans — Method

countnotnans(A)

Return the number of elements of A that are not NaNs.
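
For orientation, both counts can be reproduced with Base Julia alone. The sketch below is not the package's implementation; it only illustrates the behaviour described for countnans and countnotnans, and the input array is made up for the example.

```julia
# Base-only illustration of the two counts (not NaNStatistics code).
A = [1.0, NaN, 3.0, NaN, 5.0]   # hypothetical input

n_nan    = count(isnan, A)      # elements that are NaN
n_notnan = count(!isnan, A)     # elements that are not NaN

println((n_nan, n_notnan))
```

The two counts always sum to length(A).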

source

NaNStatistics.histcountindices! — Method

histcountindices!(N, bin, x, xedges::AbstractRange)

Simple 1D histogram; as histcounts!, but also recording the bin index of each x value.

source

NaNStatistics.histcountindices — Method

histcountindices(x, xedges::AbstractRange; T=Int64)

A 1D histogram, ignoring NaNs; as histcounts but also returning a vector of the bin index of each x value.

Examples

julia> b = 10 * rand(100000);

julia> N, bin = histcountindices(b, 0:2:10)
([20082, 19971, 20049, 19908, 19990], [2, 3, 2, 2, 4, 2, 3, 3, 2, 4  …  1, 3, 3, 3, 3, 5, 2, 3, 3, 1])

source

NaNStatistics.histcounts! — Method

histcounts!(N, x, xedges::AbstractRange)

Simple 1D histogram; as histcounts, but in-place, adding counts to the first length(xedges)-1 elements of Array N.

Note that counts will be added to N, not overwrite N, allowing you to produce cumulative histograms. However, this means you will have to initialize N with zeros before first use.
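
The accumulate-rather-than-overwrite behaviour can be sketched in Base Julia. The helper accumulate_counts! below is hypothetical and is not the package's code; it assumes bins that are closed on the left and open on the right, which may differ from how the package treats values lying exactly on an edge.

```julia
# Hypothetical helper illustrating "add to N, do not overwrite".
function accumulate_counts!(N::Vector{Int}, x, xedges::AbstractRange)
    nbins = length(xedges) - 1
    for xi in x
        isfinite(xi) || continue                                  # skip NaN and ±Inf
        i = floor(Int, (xi - first(xedges)) / step(xedges)) + 1   # candidate bin index
        if 1 <= i <= nbins
            N[i] += 1                                             # add to the existing count
        end
    end
    return N
end

N = zeros(Int, 5)                                    # must start at zero
accumulate_counts!(N, [0.5, 1.5, 2.5, NaN], 0:1:5)   # first batch
accumulate_counts!(N, [0.5, 4.5], 0:1:5)             # second batch adds to the same N
println(N)
```

Because the second call adds to the counts left by the first, N ends up describing both batches together.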

source

NaNStatistics.histcounts! — Method

histcounts!(N, x, y, xedges::AbstractRange, yedges::AbstractRange)

Simple 2D histogram; as histcounts, but in-place, adding counts to the first length(xedges)-1 columns and the first length(yedges)-1 rows of Array N.

Note that counts will be added to N, not overwrite N, allowing you to produce cumulative histograms. However, this means you will have to initialize N with zeros before first use.

source

NaNStatistics.histcounts — Method

histcounts(x, xedges::AbstractRange; T=Int64, normalize=false)

A 1D histogram, ignoring NaNs: calculate the number of x values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

By default, the counts are returned as Int64s, though this can be changed by specifying an output type with the optional keyword argument T.

Examples

julia> b = 10 * rand(100000);

julia> histcounts(b, 0:1:10)
10-element Vector{Int64}:
 10054
  9987
  9851
  9971
  9832
 10033
 10250
 10039
  9950
 10033

source

NaNStatistics.histcounts — Method

histcounts(x, y, xedges::AbstractRange, yedges::AbstractRange; T=Int64, normalize=false)

A 2D histogram, ignoring NaNs: calculate the number of x, y pairs that fall into each square of a 2D grid of equally-spaced square bins with edges specified by xedges and yedges.

The resulting matrix N of counts is oriented with the lowest x and y bins in N[1,1], where the first (vertical / row) dimension of N corresponds to the y axis (with size(N,1) == length(yedges)-1) and the second (horizontal / column) dimension of N corresponds to the x axis (with size(N,2) == length(xedges)-1).

By default, the counts are returned as Int64s, though this can be changed by specifying an output type with the optional keyword argument T.

Examples

julia> x = y = 0.5:9.5;

julia> xedges = yedges = 0:10;

julia> N = histcounts(x,y,xedges,yedges)
10×10 Matrix{Int64}:
 1  0  0  0  0  0  0  0  0  0
 0  1  0  0  0  0  0  0  0  0
 0  0  1  0  0  0  0  0  0  0
 0  0  0  1  0  0  0  0  0  0
 0  0  0  0  1  0  0  0  0  0
 0  0  0  0  0  1  0  0  0  0
 0  0  0  0  0  0  1  0  0  0
 0  0  0  0  0  0  0  1  0  0
 0  0  0  0  0  0  0  0  1  0
 0  0  0  0  0  0  0  0  0  1
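
The row/column orientation is easier to see with unequal numbers of x and y bins. The function counts2d below is a hypothetical Base-only sketch of the layout described above (rows follow y, columns follow x); it is not the package's implementation, and its edge handling (left-closed, right-open bins) is an assumption.

```julia
# Hypothetical sketch of the documented orientation: N[row, column] = N[y bin, x bin].
function counts2d(x, y, xedges::AbstractRange, yedges::AbstractRange)
    N = zeros(Int, length(yedges) - 1, length(xedges) - 1)
    for (xi, yi) in zip(x, y)
        (isfinite(xi) && isfinite(yi)) || continue                 # skip NaN pairs
        j = floor(Int, (xi - first(xedges)) / step(xedges)) + 1    # column from x
        i = floor(Int, (yi - first(yedges)) / step(yedges)) + 1    # row from y
        if 1 <= i <= size(N, 1) && 1 <= j <= size(N, 2)
            N[i, j] += 1
        end
    end
    return N
end

# Three x bins and two y bins give a 2×3 matrix.
N = counts2d([0.5, 2.5, 2.5], [1.5, 0.5, 0.5], 0:1:3, 0:1:2)
println(size(N))
```

Here the two points at (x, y) = (2.5, 0.5) land in row 1, column 3, and the point at (0.5, 1.5) lands in row 2, column 1.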

source

NaNStatistics.histkurtosis — Method

histkurtosis(counts, bincenters)

Estimate the excess kurtosis [1] of the data represented by a histogram, specified as counts in equally spaced bins centered at bincenters.

[1] We follow Distributions.jl in returning excess kurtosis rather than raw kurtosis. Excess kurtosis is defined as kurtosis - 3, such that a Normal distribution has zero excess kurtosis.

Examples

julia> binedges = -10:0.01:10;

julia> counts = histcounts(randn(10000), binedges);

julia> bincenters = (binedges[1:end-1] + binedges[2:end])/2
-9.995:0.01:9.995

julia> histkurtosis(counts, bincenters)
0.028863400305099596
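
As a rough illustration of the definition in note [1], excess kurtosis can be computed from binned data using count-weighted central moments. This is a sketch under that assumption, not the package's implementation; the counts and bin centers are made up.

```julia
# Moment-based sketch of excess kurtosis from binned data.
counts     = [1, 2, 1]        # hypothetical counts
bincenters = 0.5:1.0:2.5      # centers of three equally spaced bins

n  = sum(counts)
μ  = sum(counts .* bincenters) / n                 # histogram mean
m2 = sum(counts .* (bincenters .- μ) .^ 2) / n     # second central moment
m4 = sum(counts .* (bincenters .- μ) .^ 4) / n     # fourth central moment

excess_kurtosis = m4 / m2^2 - 3                    # kurtosis minus 3
println(excess_kurtosis)
```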

source

NaNStatistics.histmean — Method

histmean(counts, bincenters)

Estimate the mean of the data represented by a histogram, specified as counts in equally spaced bins centered at bincenters.

Examples

julia> binedges = -10:0.01:10;

julia> counts = histcounts(randn(10000), binedges);

julia> bincenters = (binedges[1:end-1] + binedges[2:end])/2
-9.995:0.01:9.995

julia> histmean(counts, bincenters)
0.0039890000000003135

source

NaNStatistics.histskewness — Method

histskewness(counts, bincenters)

Estimate the skewness of the data represented by a histogram, specified as counts in equally spaced bins centered at bincenters.

Examples

julia> binedges = -10:0.01:10;

julia> counts = histcounts(randn(10000), binedges);

julia> bincenters = (binedges[1:end-1] + binedges[2:end])/2
-9.995:0.01:9.995

julia> histskewness(counts, bincenters)
0.011075369240851738

source

NaNStatistics.histstd — Method

histstd(counts, bincenters; corrected::Bool=true)

Estimate the standard deviation of the data represented by a histogram, specified as counts in equally spaced bins centered at bincenters.

If counts have been normalized, or represent an analytical estimate of a PDF rather than a histogram representing counts of a dataset, Bessel's correction to the standard deviation should likely not be performed - i.e., set the corrected keyword argument to false.

Examples

julia> binedges = -10:0.01:10;

julia> counts = histcounts(randn(10000), binedges);

julia> bincenters = (binedges[1:end-1] + binedges[2:end])/2
-9.995:0.01:9.995

julia> histstd(counts, bincenters)
0.999592620230683

source

NaNStatistics.histvar — Method

histvar(counts, bincenters; corrected::Bool=true)

Estimate the variance of the data represented by a histogram, specified as counts in equally spaced bins centered at bincenters.

If counts have been normalized, or represent an analytical estimate of a PDF rather than a histogram representing counts of a dataset, Bessel's correction to the variance should likely not be performed - i.e., set the corrected keyword argument to false.

Examples

julia> binedges = -10:0.01:10;

julia> counts = histcounts(randn(10000), binedges);

julia> bincenters = (binedges[1:end-1] + binedges[2:end])/2
-9.995:0.01:9.995

julia> histvar(counts, bincenters)
0.9991854064196424
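
The effect of the corrected keyword can be sketched by hand, assuming the variance is a count-weighted sum of squared deviations from the histogram mean. This is an illustration of Bessel's correction on binned data, not the package's code; the inputs are made up.

```julia
# Count-weighted variance with and without Bessel's correction.
counts     = [1, 2, 1]        # hypothetical raw counts
bincenters = 0.5:1.0:2.5

n  = sum(counts)                                # total number of observations
μ  = sum(counts .* bincenters) / n              # histogram mean
ss = sum(counts .* (bincenters .- μ) .^ 2)      # weighted sum of squared deviations

var_corrected   = ss / (n - 1)   # suited to raw counts of a dataset
var_uncorrected = ss / n         # suited to normalized counts or a PDF
println((var_corrected, var_uncorrected))
```

With normalized counts, n would no longer be a sample size, which is why the correction is best switched off in that case.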

source

NaNStatistics.inpctile — Method

inpctile(A, p::Number; dims)

Return a boolean array that identifies which values of the iterable collection A fall within the central pth percentile, optionally along a dimension specified by dims.

A valid percentile value must satisfy 0 <= p <= 100.
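
One plausible reading of "central pth percentile" is the set of values lying between the (50 - p/2)th and (50 + p/2)th percentiles of the non-NaN data. The sketch below implements that reading with the Statistics standard library; the interpretation, the inclusive bounds, and the variable names are assumptions rather than the package's documented behaviour.

```julia
using Statistics

A = collect(1.0:100.0)   # hypothetical data
p = 90                   # keep the central 90 percent

valid = filter(!isnan, A)
lo = quantile(valid, (50 - p / 2) / 100)   # lower bound (5th percentile here)
hi = quantile(valid, (50 + p / 2) / 100)   # upper bound (95th percentile here)

mask = (A .>= lo) .& (A .<= hi)            # NaN entries compare false
println(count(mask))
```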

source

NaNStatistics.movmean — Method

movmean(x::AbstractVector{T}, win::Tuple{Int, Int}=(1, 1); skip_centre=false) where {T<:Real}

Compute the simple moving average of a 1-dimensional array x over a window defined by win, returning an array of the same size as x.

Arguments

x::AbstractVector{T}: Input array of type T.
win::Tuple{Int64, Int64}: Tuple defining the window size to the left and right of each element. Default is (1, 1).
skip_centre::Bool: If true, the center element is skipped in the average calculation. Default is false.

Returns

Vector: An array of the same size as x containing the moving averages.

Example

x = [1, 2, 3, 4, 5]
win = (1, 1)
movmean(x, win)  # returns [1.5, 2.0, 3.0, 4.0, 4.5]
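
The windowing in this example can be reproduced with a short Base-only function. The name window_mean is hypothetical and the sketch is not the package's implementation: it truncates the window at the array boundaries, and it handles neither NaNs nor the skip_centre option.

```julia
# Hypothetical sketch: mean over win[1] elements to the left and win[2] to the right.
function window_mean(x::Vector{<:Real}, win::Tuple{Int,Int}=(1, 1))
    n = length(x)
    out = Vector{Float64}(undef, n)
    for i in 1:n
        lo = max(1, i - win[1])          # clamp the window at the left boundary
        hi = min(n, i + win[2])          # clamp the window at the right boundary
        out[i] = sum(@view x[lo:hi]) / (hi - lo + 1)
    end
    return out
end

result = window_mean([1, 2, 3, 4, 5], (1, 1))
println(result)
```

The first and last outputs average only two elements because the window is cut off at the ends.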

source

NaNStatistics.movmean — Method

movmean(x::AbstractVecOrMat, n::Number)

Simple moving average of x in 1 or 2 dimensions, spanning n bins (or n*n in 2D), returning an array of the same size as x.

For the resulting moving average to be symmetric, n must be an odd integer; if n is not an odd integer, the first odd integer greater than n will be used instead.

source

NaNStatistics.movsum — Method

movsum(x::AbstractVecOrMat, n::Number)

Simple moving sum of x in 1 or 2 dimensions, spanning n bins (or n*n in 2D), returning an array of the same size as x.

For the resulting moving sum to be symmetric, n must be an odd integer; if n is not an odd integer, the first odd integer greater than n will be used instead.

source

NaNStatistics.nanaad — Method

nanaad(A; dims, size_threshold=NANMEAN_SIZE_THRESHOLD)

Mean (average) absolute deviation from the mean, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims. Note that for a Normal distribution, sigma = 1.253 * AAD. The size_threshold argument is supported for taking the mean; see nanmean for more information.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).
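
The quantity itself is simple to write out with the Statistics standard library. The following is an illustrative sketch on a made-up vector, not the package's implementation.

```julia
using Statistics

A = [1.0, 2.0, NaN, 3.0]                  # hypothetical input
valid = filter(!isnan, A)                 # drop NaNs first

aad = mean(abs.(valid .- mean(valid)))    # mean absolute deviation from the mean
sigma_estimate = 1.253 * aad              # Normal-distribution scaling noted above
println((aad, sigma_estimate))
```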

source

NaNStatistics.nanadd! — Method

nanadd!(A, B)

Add the non-NaN elements of B to A, treating NaNs as zeros

source

NaNStatistics.nanadd — Method

nanadd(A, B)

Add the non-NaN elements of A and B, treating NaNs as zeros
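
Read literally, "treating NaNs as zeros" means a position where both inputs are NaN contributes zero. A Base-only sketch of that reading, with a hypothetical helper zero_if_nan (not part of the package):

```julia
# Elementwise sum with NaNs treated as zeros.
zero_if_nan(x) = isnan(x) ? zero(x) : x   # hypothetical helper

A = [1.0, NaN, 3.0]
B = [NaN, NaN, 1.0]

C = zero_if_nan.(A) .+ zero_if_nan.(B)
println(C)
```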

source

NaNStatistics.nanargmax — Method

nanargmax(A)

As argmax but ignoring NaNs: Find the index of the largest non-NaN value (if any) of an indexable collection A

source

NaNStatistics.nanargmin — Method

nanargmin(A)

As argmin but ignoring NaNs: Find the index of the smallest non-NaN value (if any) of an indexable collection A

source

NaNStatistics.nanbinmean! — Method

nanbinmean!(MU, N, x, y, z, xedges::AbstractRange, yedges::AbstractRange)

Ignoring NaNs, fill the matrix MU with the means and N with the counts of non-NaN z values that fall into a 2D grid of x and y bins defined by xedges and yedges. The independent variables x and y, as well as the dependent variable z, are all expected as 1D vectors (any subtype of AbstractVector).

The output matrices MU and N must be the same size, and must each have length(yedges)-1 rows and length(xedges)-1 columns.

source

NaNStatistics.nanbinmean! — Method

nanbinmean!(MU, [N], x, y, xedges::AbstractRange)

Ignoring NaNs, fill the array MU with the means (and optionally N with the counts) of non-NaN y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

The array of x data should be given as a one-dimensional array (any subtype of AbstractVector) and y as either a 1-d or 2-d array (any subtype of AbstractVecOrMat).

The output arrays MU and N must be the same size, and must have the same number of columns as y; if y is a 2-d array (matrix), then each column of y will be treated as a separate variable.

source

NaNStatistics.nanbinmean — Method

nanbinmean(x, y, xedges::AbstractRange)

Ignoring NaNs, calculate the mean of y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

The array of x data should be given as a one-dimensional array (any subtype of AbstractVector) and y as either a 1-d or 2-d array (any subtype of AbstractVecOrMat). If y is a 2-d array, then each column of y will be treated as a separate variable.

Examples

julia> nanbinmean([1:100..., 1], [1:100..., NaN], 0:25:100)
4-element Vector{Float64}:
 13.0
 38.0
 63.0
 87.5

julia> nanbinmean(1:100, reshape(1:300,100,3), 0:25:100)
4×3 Matrix{Float64}:
 13.0  113.0  213.0
 38.0  138.0  238.0
 63.0  163.0  263.0
 87.5  187.5  287.5

source

NaNStatistics.nanbinmean — Method

nanbinmean(x, y, xedges::AbstractRange)

Ignoring NaNs, calculate the weighted mean of y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

The array of x data should be given as a one-dimensional array (any subtype of AbstractVector) and y as either a 1-d or 2-d array (any subtype of AbstractVecOrMat). If y is a 2-d array, then each column of y will be treated as a separate variable.

source

NaNStatistics.nanbinmean — Method

nanbinmean(x, y, z, xedges, yedges)

Ignoring NaNs, calculate the mean of z values that fall into a 2D grid of x and y bins with bin edges defined by xedges and yedges. The independent variables x and y, as well as the dependent variable z, are all expected as 1D vectors (any subtype of AbstractVector).

Examples

julia> x = y = z = 0.5:9.5;

julia> xedges = yedges = 0:10;

julia> nanbinmean(x,y,z,xedges,yedges)
10×10 Matrix{Float64}:
   0.5  NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
 NaN      1.5  NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
 NaN    NaN      2.5  NaN    NaN    NaN    NaN    NaN    NaN    NaN
 NaN    NaN    NaN      3.5  NaN    NaN    NaN    NaN    NaN    NaN
 NaN    NaN    NaN    NaN      4.5  NaN    NaN    NaN    NaN    NaN
 NaN    NaN    NaN    NaN    NaN      5.5  NaN    NaN    NaN    NaN
 NaN    NaN    NaN    NaN    NaN    NaN      6.5  NaN    NaN    NaN
 NaN    NaN    NaN    NaN    NaN    NaN    NaN      7.5  NaN    NaN
 NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN      8.5  NaN
 NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN      9.5

source

NaNStatistics.nanbinmedian! — Method

nanbinmedian!(M, [N], x, y, xedges::AbstractRange)

Fill the array M with the medians (and optionally N with the counts) of non-NaN y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

If y is a 2-d array (matrix), each column will be treated as a separate variable

source

NaNStatistics.nanbinmedian — Method

nanbinmedian(x, y, xedges::AbstractRange)

Calculate the median, ignoring NaNs, of y values that fall into each of length(xedges)-1 equally spaced bins along the x axis with bin edges specified by xedges.

If y is a 2-d array (matrix), each column will be treated as a separate variable

Examples

julia> nanbinmedian([1:100..., 1], [1:100..., NaN], 0:25:100)
4-element Vector{Float64}:
 12.5
 37.0
 62.0
 87.0

julia> nanbinmedian(1:100, reshape(1:300,100,3), 0:25:100)
4×3 Matrix{Float64}:
 12.5  112.5  212.5
 37.0  137.0  237.0
 62.0  162.0  262.0
 87.0  187.0  287.0

source

NaNStatistics.nancor — Method

nancor(X::AbstractMatrix; dims::Int=1)

Compute the (Pearson's product-moment) correlation matrix of the matrix X, along dimension dims. As Statistics.cor, but ignoring NaNs.

source

NaNStatistics.nancor — Method

nancor(x::AbstractVector, y::AbstractVector)

Compute the (Pearson's product-moment) correlation between the vectors x and y. As Statistics.cor, but ignoring NaNs.

Equivalent to nancov(x,y) / (nanstd(x) * nanstd(y)).
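
A NaN-ignoring correlation can be approximated with the Statistics standard library by dropping every pair in which either value is NaN. That pairwise-deletion rule is an assumption about what "ignoring NaNs" means here, and the sketch is not the package's implementation.

```julia
using Statistics

x = [1.0, 2.0, NaN, 4.0, 5.0]    # hypothetical data
y = [2.0, 4.0, 6.0, NaN, 10.0]

# Keep only positions where both values are present.
keep = map((a, b) -> !isnan(a) && !isnan(b), x, y)

r = cor(x[keep], y[keep])        # Pearson correlation of the complete pairs
println(r)
```

The three surviving pairs lie on the line y = 2x, so the correlation is 1.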

source

NaNStatistics.nancov — Method

nancov(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)

Compute the covariance matrix of the matrix X, along dimension dims. As Statistics.cov, but ignoring NaNs.

If corrected is true as is the default, Bessel's correction will be applied, such that the sum is scaled by n-1 rather than n, where n = length(x).

source

NaNStatistics.nancov — Method

nancov(x::AbstractVector, y::AbstractVector; corrected::Bool=true)

Compute the covariance between the vectors x and y. As Statistics.cov, but ignoring NaNs.

If corrected is true as is the default, Bessel's correction will be applied, such that the sum is scaled by n-1 rather than n, where n = length(x).
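
The two scalings can be compared by hand. The sketch below assumes that pairs containing a NaN are dropped and that n counts the remaining pairs; both are assumptions for illustration, not statements about the package's internals.

```julia
x = [1.0, 2.0, 3.0, NaN]         # hypothetical data
y = [2.0, 4.0, 6.0, 8.0]

keep = map((a, b) -> !isnan(a) && !isnan(b), x, y)
xv, yv = x[keep], y[keep]
n = length(xv)                   # number of complete pairs

# Sum of products of deviations from the respective means.
s = sum((xv .- sum(xv) / n) .* (yv .- sum(yv) / n))

cov_corrected   = s / (n - 1)    # Bessel's correction (corrected=true)
cov_uncorrected = s / n          # no correction (corrected=false)
println((cov_corrected, cov_uncorrected))
```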

source

NaNStatistics.nancovem — Method

nancovem(X::AbstractMatrix; dims::Int=1, corrected::Bool=true)

Compute the covariance-of-error-of-the-mean matrix of the matrix X, along dimension dims, ignoring NaNs. That is, a matrix composed of the covariance of the error of the mean of the non-NaN pairs of each column (dims=1) or row (dims=2) of the matrix X, where covariance of the error of the mean is to covariance as standard error of the mean is to standard deviation.

If corrected is true as is the default, Bessel's correction will be applied, such that the sum is scaled by n-1 rather than n, where n = length(x).

source

NaNStatistics.nancovem — Method

nancovem(x::AbstractVector, y::AbstractVector; corrected::Bool=true)

Compute the covariance of the error of the mean of the non-NaN pairs of x and y, where covariance of the error of the mean is to covariance as standard error of the mean is to standard deviation.

If corrected is true as is the default, Bessel's correction will be applied to the covariance, such that the sum is scaled by n-1 rather than n, where n = length(x).
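
The analogy with the standard error of the mean suggests a simple relation: since the squared standard error is var / n, the covariance of the error of the mean would be cov / n. The sketch below works through that reading with the Statistics standard library; it is an interpretation of the docstring, with n taken as the number of complete pairs, and not the package's code.

```julia
using Statistics

x = [1.0, 2.0, 3.0, NaN]         # hypothetical data
y = [2.0, 4.0, 6.0, 8.0]

keep = map((a, b) -> !isnan(a) && !isnan(b), x, y)
xv, yv = x[keep], y[keep]
n = length(xv)

covem   = cov(xv, yv) / n        # assumed: covariance scaled by 1/n
sem_x   = std(xv) / sqrt(n)      # standard error of the mean of x
covem_x = cov(xv, xv) / n        # x with itself should equal sem_x^2
println((covem, sem_x^2, covem_x))
```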

source

NaNStatistics.nancumsum — Method

nancumsum(A; dims)

Calculate the cumulative sum of an indexable collection A, ignoring NaNs, optionally along dimensions specified by dims.

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4; NaN NaN]
3×2 Matrix{Float64}:
   1.0    2.0
   3.0    4.0
 NaN    NaN

julia> nancumsum(A, dims=1)
3×2 Matrix{Float64}:
 1.0  2.0
 4.0  6.0
 4.0  6.0

julia> nancumsum(vec(A))
6-element Vector{Float64}:
  1.0
  4.0
  4.0
  6.0
 10.0
 10.0

source

NaNStatistics.nanextrema — Method

nanextrema(A; dims)

Find the extrema (maximum & minimum) of an indexable collection A, ignoring NaNs, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

source

NaNStatistics.nankurtosis — Method

nankurtosis(A; dims=:, mean=nothing)

Compute the kurtosis of all non-NaN elements in A, optionally over dimensions specified by dims. As StatsBase.kurtosis, but ignoring NaNs.

A precomputed mean may optionally be provided, which results in a somewhat faster calculation. If corrected is true, then Bessel's correction is applied, such that the sum is divided by n-1 rather than n.

As an alternative to dims, nankurtosis also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2 3 ; 4 5 6; 7 8 9]
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

julia> nankurtosis(A, dims=1)
1×3 Matrix{Float64}:
 -1.5  -1.5  -1.5

julia> nankurtosis(A, dims=2)
3-element Vector{Float64}:
 -1.5
 -1.5
 -1.5

source

NaNStatistics.nanlogcumsumexp — Method

nanlogcumsumexp(A)

Return the logarithm of the cumulative sum of exp.(A) – i.e., log.(cumsum(exp.(A))), but ignoring NaNs and avoiding numerical over/underflow. As nancumsum, but operating on logarithms; as nanlogsumexp, but returning an array of cumulative sums rather than a single value.

Examples
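
No package example is given here, so the following Base-only sketch illustrates the idea instead. The function logcumsumexp_skipnan is hypothetical and is not the package's implementation; it assumes that a NaN entry simply repeats the running value, by analogy with the nancumsum example.

```julia
# Running log-sum-exp that skips NaNs and shifts by the running maximum for stability.
function logcumsumexp_skipnan(A::Vector{Float64})
    out = similar(A)
    acc = -Inf                                   # log(0): nothing accumulated yet
    for (k, a) in enumerate(A)
        if !isnan(a)
            m = max(acc, a)
            # Subtracting m keeps the arguments of exp at or below zero.
            acc = isfinite(m) ? m + log(exp(acc - m) + exp(a - m)) : m
        end
        out[k] = acc                             # a NaN entry repeats the running value
    end
    return out
end

r = logcumsumexp_skipnan([0.0, NaN, 0.0, 1000.0])
println(r)
```

A naive log.(cumsum(exp.(A))) would overflow to Inf at the last element, because exp(1000.0) is not representable as a Float64.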

source

NaNStatistics.nanlogsumexp — Method

nanlogsumexp(A)

Return the logarithm of the sum of exp.(A) – i.e., log(sum(exp.(A))), but ignoring NaNs and avoiding numerical over/underflow. As nansum, but operating on logarithms; as nanlogcumsumexp, but returning a single value rather than an array of cumulative sums.

Examples
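
No package example is given here, so the following Base-only sketch shows the usual max-shift technique. The function logsumexp_skipnan is hypothetical and is not the package's implementation; in particular, returning -Inf when no non-NaN values remain is an assumption.

```julia
# Log-sum-exp over the non-NaN elements, shifted by the maximum for stability.
function logsumexp_skipnan(A)
    vals = [a for a in A if !isnan(a)]
    isempty(vals) && return -Inf          # assumption: behaves like log(0)
    m = maximum(vals)
    isfinite(m) || return m               # ±Inf dominates the sum
    return m + log(sum(exp.(vals .- m)))
end

s     = logsumexp_skipnan([log(1.0), log(2.0), NaN, log(3.0)])   # log(1 + 2 + 3)
large = logsumexp_skipnan([1000.0, 1000.0])                      # no overflow
println((s, large))
```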

source

NaNStatistics.nanmad! — Method

nanmad!(A; dims)

As nanmad but in-place.

source

NaNStatistics.nanmad — Method

nanmad(A; dims)

Median absolute deviation from the median, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims. Note that for a Normal distribution, sigma = 1.4826 * MAD.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).
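
The definition can be written out directly with the Statistics standard library. This is an illustrative sketch on a made-up vector with one outlier, not the package's implementation.

```julia
using Statistics

A = [1.0, 2.0, 3.0, 4.0, 100.0, NaN]      # hypothetical input with an outlier
valid = filter(!isnan, A)

med = median(valid)                       # 3.0
mad = median(abs.(valid .- med))          # median of [2, 1, 0, 1, 97]
sigma_estimate = 1.4826 * mad             # Normal-distribution scaling noted above
println((med, mad, sigma_estimate))
```

The outlier at 100 leaves the result untouched, which is the usual reason to prefer this statistic over the standard deviation for contaminated data.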

source

NaNStatistics.nanmask! — Method

nanmask!(mask, A)

Fill a Boolean mask of dimensions size(A) that is false wherever A is NaN

source

NaNStatistics.nanmask — Method

nanmask(A)

Create a Boolean mask of dimensions size(A) that is false wherever A is NaN
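
A mask with this meaning can be built in Base Julia and used for logical indexing. The sketch below only illustrates the described result and is not the package's code.

```julia
A = [1.0 NaN; 3.0 4.0]        # hypothetical input

mask = map(!isnan, A)         # true where A holds a number, false where A is NaN
kept = A[mask]                # logical indexing returns the non-NaN values as a vector
println(mask)
println(kept)
```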

source

NaNStatistics.nanmax — Method

nanmax(a,b)

As max(a,b), but if either argument is NaN, return the other one
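
For contrast, Base max propagates NaN. The hypothetical function pick_max below sketches the rule stated above; it is not the package's implementation.

```julia
# Return the non-NaN argument when exactly one argument is NaN.
pick_max(a, b) = isnan(a) ? b : (isnan(b) ? a : max(a, b))

println(max(1.0, NaN))        # Base propagates the NaN
println(pick_max(1.0, NaN))   # the NaN is ignored
println(pick_max(NaN, NaN))   # nothing else to return
```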

source

NaNStatistics.nanmaximum — Method

nanmaximum(A; dims)

As maximum but ignoring NaNs: Find the largest non-NaN value of an indexable collection A, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

source

NaNStatistics.nanmean! — Method

nanmean!(B, A; dims=:, dim=:, size_threshold=NANMEAN_SIZE_THRESHOLD)

Same as nanmean, except that the result will be written to the array B. If B cannot be reshaped to the right size then a DimensionMismatch exception will be thrown. The returned array may be a different size than B depending on whether dims/dim is used, but it will always alias B.

source

NaNStatistics.nanmean — Method

nanmean(A, W; dims)

Ignoring NaNs, calculate the weighted mean of an indexable collection A, optionally along dimensions specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

source

NaNStatistics.nanmean — Method

nanmean(A; dims, size_threshold)

Compute the mean of all non-NaN elements in A, optionally over dimensions specified by dims. As Statistics.mean, but ignoring NaNs.

As an alternative to dims, nanmean also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

nanmean has optimized implementations for big and small arrays for reducing over slow dimensions. Which implementation is used can be tuned by setting the size_threshold keyword argument to either an integer number of bytes, or the symbols :L1/:L2/:L3 to use a specific cache size. The Hwloc.jl package must be loaded to support setting the threshold by cache symbol. The default threshold is 1MB, but if Hwloc.jl is loaded then the default will be set to :L1 for the L1 cache size.

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> nanmean(A, dims=1)
1×2 Matrix{Float64}:
 2.0  3.0

julia> nanmean(A, dims=2)
2×1 Matrix{Float64}:
 1.5
 3.5

source

NaNStatistics.nanmedian! — Method

nanmedian!(A; dims)

Compute the median of all elements in A, optionally over dimensions specified by dims. As Statistics.median!, but ignoring NaNs and supporting the dims keyword.

Be aware that, like Statistics.median!, this function modifies A, sorting or partially sorting the contents thereof (specifically, along the dimensions specified by dims, using either quicksort! or quickselect! around the median depending on the size of the array). Do not use this function if you do not want the contents of A to be rearranged.

Reduction over multiple dims is not officially supported, though does work (in generally suboptimal time) as long as the dimensions being reduced over are all contiguous.

Optionally supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

julia> nanmedian!(A, dims=1)
1×3 Matrix{Float64}:
 4.0  5.0  6.0

julia> nanmedian!(A, dims=2)
3×1 Matrix{Float64}:
 2.0
 5.0
 8.0

julia> nanmedian!(A)
5.0

julia> A # Note that the array has been sorted
3×3 Matrix{Int64}:
 1  4  7
 2  5  8
 3  6  9

source

NaNStatistics.nanmedian — Method

nanmedian(A; dims)

Calculate the median, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims.

Reduction over multiple dims is not officially supported, though does work (in generally suboptimal time) as long as the dimensions being reduced over are all contiguous.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

source

NaNStatistics.nanmin — Method

nanmin(a,b)

As min(a,b), but if either argument is NaN, return the other one

source

NaNStatistics.nanminimum — Method

nanminimum(A; dims)

As minimum but ignoring NaNs: Find the smallest non-NaN value of an indexable collection A, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

source

NaNStatistics.nanpctile! — Method

nanpctile!(A, p; dims)

Compute the pth percentile (where p ∈ [0,100]) of all elements in A, ignoring NaNs, optionally over dimensions specified by dims.

As StatsBase.percentile, but in-place, ignoring NaNs, and supporting the dims keyword.

Be aware that, like Statistics.median!, this function modifies A, sorting or partially sorting the contents thereof (specifically, along the dimensions specified by dims, using either quicksort! or quickselect! depending on the size of the array). Do not use this function if you do not want the contents of A to be rearranged.

Reduction over multiple dims is not officially supported, though does work (in generally suboptimal time) as long as the dimensions being reduced over are all contiguous.

Examples

julia> using NaNStatistics

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

julia> nanpctile!(A, 50, dims=1)
1×3 Matrix{Float64}:
 4.0  5.0  6.0

julia> nanpctile!(A, 50, dims=2)
3×1 Matrix{Float64}:
 2.0
 5.0
 8.0

julia> nanpctile!(A, 50)
5.0

julia> A # Note that the array has been sorted
3×3 Matrix{Int64}:
 1  4  7
 2  5  8
 3  6  9

source

NaNStatistics.nanpctile — Method

nanpctile(A, p; dims)

Find the pth percentile (where 0 <= p <= 100) of an indexable collection A, ignoring NaNs, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

See also nanpctile! for a more efficient in-place variant.
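
A percentile p corresponds to the quantile q = p / 100. The sketch below shows that relation with the Statistics standard library after dropping NaNs; it assumes the same interpolation rule as the default of Statistics.quantile, which may not match the package in every case.

```julia
using Statistics

A = [1.0, 2.0, 3.0, 4.0, 5.0, NaN]    # hypothetical input
valid = filter(!isnan, A)

p50 = quantile(valid, 50 / 100)       # 50th percentile, i.e. the median
p25 = quantile(valid, 25 / 100)       # 25th percentile
println((p50, p25))
```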

source

NaNStatistics.nanquantile! — Method

nanquantile!(A, q; dims)

Compute the qth quantile (where q ∈ [0,1]) of all elements in A, ignoring NaNs, optionally over dimensions specified by dims.

Similar to StatsBase.quantile!, but ignoring NaNs, and supporting the dims keyword.

Be aware that, like StatsBase.quantile!, this function modifies A, sorting or partially sorting the contents thereof (specifically, along the dimensions specified by dims, using either quicksort! or quickselect! depending on the size of the array). Do not use this function if you do not want the contents of A to be rearranged.

Reduction over multiple dims is not officially supported, though does work (in generally suboptimal time) as long as the dimensions being reduced over are all contiguous.

Examples

julia> using NaNStatistics

julia> A = [1 2 3; 4 5 6; 7 8 9]
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

julia> nanquantile!(A, 0.5, dims=1)
1×3 Matrix{Float64}:
 4.0  5.0  6.0

julia> nanquantile!(A, 0.5, dims=2)
3×1 Matrix{Float64}:
 2.0
 5.0
 8.0

julia> nanquantile!(A, 0.5)
5.0

julia> A # Note that the array has been sorted
3×3 Matrix{Int64}:
 1  4  7
 2  5  8
 3  6  9

source

NaNStatistics.nanquantile — Method

nanquantile(A, q; dims)

Compute the qth quantile (where q ∈ [0,1]) of all elements in A, ignoring NaNs, optionally over dimensions specified by dims.

Reduction over multiple dims is not officially supported, though does work (in generally suboptimal time) as long as the dimensions being reduced over are all contiguous.

See also nanquantile! for a more efficient in-place variant.

source

NaNStatistics.nanrange — Method

nanrange(A; dims)

Calculate the range (maximum - minimum) of an indexable collection A, ignoring NaNs, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

source

NaNStatistics.nansem — Method

nansem(A; dims=:, mean=nothing, corrected=true)

Compute the standard error of the mean of all non-NaN elements in A, optionally over dimensions specified by dims.
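
The standard error of the mean is the standard deviation divided by the square root of the number of observations. A sketch with the Statistics standard library, taking n as the number of non-NaN elements (an assumption, and not the package's implementation):

```julia
using Statistics

A = [2.0, 4.0, NaN, 6.0]          # hypothetical input
valid = filter(!isnan, A)
n = length(valid)                 # NaNs do not count as observations

sem = std(valid) / sqrt(n)        # std(valid) uses Bessel's correction by default
println(sem)
```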

As an alternative to dims, nansem also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> nansem(A, dims=1)
1×2 Matrix{Float64}:
 1.0  1.0

julia> nansem(A, dims=2)
2×1 Matrix{Float64}:
 0.5
 0.5

source

NaNStatistics.nanskewness — Method

nanskewness(A; dims=:, mean=nothing)

Compute the skewness of all non-NaN elements in A, optionally over dimensions specified by dims. As StatsBase.skewness, but ignoring NaNs.

A precomputed mean may optionally be provided, which results in a somewhat faster calculation.

As an alternative to dims, nanskewness also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2 3 ; 4 5 6; 7 8 9]
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9

julia> nanskewness(A, dims=1)
1×3 Matrix{Float64}:
 0.0  0.0  0.0


julia> nanskewness(A, dims=2)
3-element Vector{Float64}:
 0.0
 0.0
 0.0

source

NaNStatistics.nanstandardize! — Method

nanstandardize!(A::Array{<:AbstractFloat}; dims)

Rescale A to unit variance and zero mean i.e. A .= (A .- nanmean(A)) ./ nanstd(A)

source

NaNStatistics.nanstandardize — Method

nanstandardize(A; dims)

Rescale a copy of A to unit variance and zero mean i.e. (A .- nanmean(A)) ./ nanstd(A)
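
Written out with the Statistics standard library, the rescaling looks as follows. The sketch computes the mean and standard deviation over the non-NaN elements only and leaves NaN entries in place; it is an illustration, not the package's implementation.

```julia
using Statistics

A = [1.0, 2.0, 3.0, NaN]          # hypothetical input
valid = filter(!isnan, A)

μ = mean(valid)                   # 2.0
σ = std(valid)                    # 1.0
z = (A .- μ) ./ σ                 # NaN entries remain NaN
println(z)
```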

source

NaNStatistics.nanstd — Method

nanstd(A, W; dims)

Calculate the weighted standard deviation, ignoring NaNs, of an indexable collection A, optionally along a dimension specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

source

NaNStatistics.nanstd — Method

nanstd(A; dims=:, mean=nothing, corrected=true)

Compute the standard deviation of all non-NaN elements in A, optionally over dimensions specified by dims. As Statistics.std, but ignoring NaNs.

As an alternative to dims, nanstd also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> nanstd(A, dims=1)
1×2 Matrix{Float64}:
 1.41421  1.41421

julia> nanstd(A, dims=2)
2×1 Matrix{Float64}:
 0.7071067811865476
 0.7071067811865476

source

NaNStatistics.nansum! — Method

nansum!(B, A; dims=:, dim=:)

Same as nansum, except that the result will be written to the array B. If B cannot be reshaped to the right size then a DimensionMismatch exception will be thrown. The returned array may be a different size than B depending on whether dims/dim is used, but it will always alias B.

source

NaNStatistics.nansum — Method

nansum(A; dims)

Calculate the sum of an indexable collection A, ignoring NaNs, optionally along dimensions specified by dims.

Also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> nansum(A, dims=1)
1×2 Matrix{Int64}:
 4  6

julia> nansum(A, dims=2)
2×1 Matrix{Int64}:
 3
 7

source

NaNStatistics.nanvar — Method

nanvar(A; dims=:, mean=nothing, corrected=true)

Compute the variance of all non-NaN elements in A, optionally over dimensions specified by dims. As Statistics.var, but ignoring NaNs.

As an alternative to dims, nanvar also supports the dim keyword, which behaves identically to dims, but also drops any singleton dimensions that have been reduced over (as is the convention in some other languages).

Examples

julia> using NaNStatistics

julia> A = [1 2; 3 4]
2×2 Matrix{Int64}:
 1  2
 3  4

julia> nanvar(A, dims=1)
1×2 Matrix{Float64}:
 2.0  2.0

julia> nanvar(A, dims=2)
2×1 Matrix{Float64}:
 0.5
 0.5

source

NaNStatistics.zeronan! — Method

zeronan!(A)

Replace all NaNs in A with zeros of the same type
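
The in-place replacement amounts to a single loop in Base Julia. This sketch mirrors the described behaviour on a made-up array and is not the package's code.

```julia
A = [1.0, NaN, 3.0]               # hypothetical input

# Overwrite each NaN with a zero of the array's element type.
for i in eachindex(A)
    isnan(A[i]) && (A[i] = zero(eltype(A)))
end
println(A)
```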

source