scipy.stats.mstats.mquantiles#

scipy.stats.mstats.mquantiles(a, prob=[0.25, 0.5, 0.75], alphap=0.4, betap=0.4, axis=None, limit=())[source]#

Computes empirical quantiles for a data array.

Samples quantile are defined by Q(p) = (1-gamma)*x[j] + gamma*x[j+1], where x[j] is the j-th order statistic, and gamma is a function of j = floor(n*p + m), m = alphap + p*(1 - alphap - betap) and g = n*p + m - j.

Reinterpreting the above equations to compare to R lead to the equation: p(k) = (k - alphap)/(n + 1 - alphap - betap)

Typical values of (alphap,betap) are:
  • (0,1) : p(k) = k/n : linear interpolation of cdf (R type 4)

  • (.5,.5) : p(k) = (k - 1/2.)/n : piecewise linear function (R type 5)

  • (0,0) : p(k) = k/(n+1) : (R type 6)

  • (1,1) : p(k) = (k-1)/(n-1): p(k) = mode[F(x[k])]. (R type 7, R default)

  • (1/3,1/3): p(k) = (k-1/3)/(n+1/3): Then p(k) ~ median[F(x[k])]. The resulting quantile estimates are approximately median-unbiased regardless of the distribution of x. (R type 8)

  • (3/8,3/8): p(k) = (k-3/8)/(n+1/4): Blom. The resulting quantile estimates are approximately unbiased if x is normally distributed (R type 9)

  • (.4,.4) : approximately quantile unbiased (Cunnane)

  • (.35,.35): APL, used with PWM

Parameters:
aarray_like

Input data, as a sequence or array of dimension at most 2.

probarray_like, optional

List of quantiles to compute.

alphapfloat, optional

Plotting positions parameter, default is 0.4.

betapfloat, optional

Plotting positions parameter, default is 0.4.

axisint, optional

Axis along which to perform the trimming. If None (default), the input array is first flattened.

limittuple, optional

Tuple of (lower, upper) values. Values of a outside this open interval are ignored.

Returns:
mquantilesMaskedArray

An array containing the calculated quantiles.

Notes

This formulation is very similar to R except the calculation of m from alphap and betap, where in R m is defined with each type.

References

Examples

>>> import numpy as np
>>> from scipy.stats.mstats import mquantiles
>>> a = np.array([6., 47., 49., 15., 42., 41., 7., 39., 43., 40., 36.])
>>> mquantiles(a)
array([ 19.2,  40. ,  42.8])

Using a 2D array, specifying axis and limit.

>>> data = np.array([[   6.,    7.,    1.],
...                  [  47.,   15.,    2.],
...                  [  49.,   36.,    3.],
...                  [  15.,   39.,    4.],
...                  [  42.,   40., -999.],
...                  [  41.,   41., -999.],
...                  [   7., -999., -999.],
...                  [  39., -999., -999.],
...                  [  43., -999., -999.],
...                  [  40., -999., -999.],
...                  [  36., -999., -999.]])
>>> print(mquantiles(data, axis=0, limit=(0, 50)))
[[19.2  14.6   1.45]
 [40.   37.5   2.5 ]
 [42.8  40.05  3.55]]
>>> data[:, 2] = -999.
>>> print(mquantiles(data, axis=0, limit=(0, 50)))
[[19.200000000000003 14.6 --]
 [40.0 37.5 --]
 [42.800000000000004 40.05 --]]