STATISTICS command
Syntax: |
STATISTICS x { s1\keyword { s2\keyword ... }}
|
Qualifiers: | \MESSAGES, \WEIGHTS, \MOMENTS, \PEARSON |
Defaults: | \MESSAGES, \-WEIGHTS |
Examples: |
STATISTICS X
|
The STATISTICS command calculates various statistics for the input variable x, which can be a vector or a matrix. Specific statistics are chosen with qualifier keywords which are appended to the output parameters with the backslash, \. All vectors must be the same size.
The tables below list the parameter qualifier keywords and the corresponding output values: Table 1 for extrema, Table 2 for central measures, and Table 3 for dispersion and skewness.
Keyword | Output Value |
\MAX | maximum value of x |
\IMAX | index of the maximum if x is a vector; row index of the maximum if x is a matrix |
\JMAX | column index of the maximum if x is a matrix |
\MIN | minimum value of x |
\IMIN | index of the minimum if x is a vector; row index of the minimum if x is a matrix |
\JMIN | column index of the minimum if x is a matrix |
Table 1: Extrema keywords |
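To illustrate what the Table 1 keywords return, here is a minimal Python sketch. It assumes 1-based indices, which is consistent with the examples at the end of this section. The function name `extrema` and the tie-breaking rule (first occurrence in row-major order) are assumptions made for illustration, not documented behavior of the command.

```python
def extrema(x):
    """Return the extrema of a vector (list) or matrix (list of rows).

    Indices are 1-based. Ties resolve to the first occurrence in
    row-major order; that rule is an assumption, not documented behavior.
    """
    is_matrix = bool(x) and isinstance(x[0], (list, tuple))
    rows = x if is_matrix else [x]
    # Flatten to (value, row, column) triples with 1-based indices.
    cells = [(v, i, j)
             for i, row in enumerate(rows, start=1)
             for j, v in enumerate(row, start=1)]
    vmax, imax, jmax = max(cells, key=lambda c: c[0])
    vmin, imin, jmin = min(cells, key=lambda c: c[0])
    if is_matrix:
        return {"MAX": vmax, "IMAX": imax, "JMAX": jmax,
                "MIN": vmin, "IMIN": imin, "JMIN": jmin}
    # For a vector, the element position is the column of the single row.
    return {"MAX": vmax, "IMAX": jmax, "MIN": vmin, "IMIN": jmin}
```

For a vector only the \IMAX and \IMIN positions are meaningful, so the sketch omits the column entries in that case.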
Keyword | Output Value |
\SUM | arithmetic sum (unweighted) |
\MEAN | arithmetic mean |
\GMEAN | geometric mean |
\MEDIAN | median value |
\RMS | root-mean-square |
Table 2: Central measure keywords |
Keyword | Output Value |
\VARIANCE | variance |
\SDEV | standard deviation |
\ADEV | average deviation |
\KURTOSIS | kurtosis |
\SKEWNESS | skewness |
Table 3: Dispersion and skewness keywords |
Informational messages
The default is to display all the calculated statistics. If the \-MESSAGES command qualifier is used, and if at least one output scalar is entered, then the statistics values will not be displayed.
Weights
Syntax: |
STATISTICS\WEIGHTS w x { s1\keyword { s2\keyword ... }}
|
You must use the \WEIGHTS qualifier to indicate that a weight vector is present. Weights cannot be applied to matrix data.
A weighting factor, $w_i \ge 0$, could be the frequency, the probability, the mass, the reliability, or some other multiplier. The lengths of w and x must be equal.
Definitions
Suppose that x is a vector with N elements. If a weight vector, w, is entered, remember to use the \WEIGHTS command qualifier. The length of w is assumed to also be N. If no weights are entered, let $w_i$ default to 1, for i = 1, 2, ..., N.
Define the total weight:
$$W = w_1 + w_2 + \cdots + w_N$$
Sum
The sum is defined by
$$x_1 + x_2 + \cdots + x_N$$
Mean value
The mean value, M, is defined by
$$M = \frac{1}{W}\left[w_1 x_1 + w_2 x_2 + \cdots + w_N x_N\right]$$
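The definition above can be sketched in Python as follows. The function name `weighted_mean` is hypothetical; the weights default to 1 when none are supplied, as described under Definitions.

```python
def weighted_mean(x, w=None):
    """Weighted mean M = (1/W) * sum(w_i * x_i); weights default to 1."""
    if not x:
        raise ValueError("x must not be empty")
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)  # W in the definition
    return sum(wi * xi for wi, xi in zip(w, x)) / total_weight
```

With equal weights this reduces to the ordinary arithmetic mean.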
Geometric mean
The geometric mean, $G_x$, is defined if each $x_i \ge 0$ by:
$$G_x = \exp\left\{\frac{1}{W}\left[w_1\log(x_1) + w_2\log(x_2) + \cdots + w_N\log(x_N)\right]\right\}$$
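A minimal Python sketch of the geometric mean follows, reading the definition as the exponential of the weighted mean of the logarithms. The function name `geometric_mean` is hypothetical. Because the logarithm of zero is undefined, this sketch rejects any value that is not strictly positive; how the command itself treats a zero element is not stated here, so that restriction is an assumption.

```python
import math


def geometric_mean(x, w=None):
    """Geometric mean: exp of the weighted mean of log(x_i)."""
    if not x:
        raise ValueError("x must not be empty")
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    # Assumption of this sketch: require x_i > 0 so that log(x_i) exists.
    if any(xi <= 0 for xi in x):
        raise ValueError("all values must be strictly positive")
    total_weight = sum(w)
    log_sum = sum(wi * math.log(xi) for wi, xi in zip(w, x))
    return math.exp(log_sum / total_weight)
```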
Median
The median is the value of x that has equal numbers of values above it and below it. If N is even, the median is the average of the two central values.
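The median rule can be sketched in Python by sorting the values and taking the middle one, or the average of the two middle ones when N is even. The function name `median` is chosen for illustration only.

```python
def median(x):
    """Median of a vector: middle value, or mean of the two central values."""
    if not x:
        raise ValueError("x must not be empty")
    ordered = sorted(x)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd N: a single central value exists.
        return ordered[mid]
    # Even N: average the two central values.
    return (ordered[mid - 1] + ordered[mid]) / 2
```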
Root-mean-square
The root-mean-square, RMS, is defined by
$$\mathrm{RMS} = \sqrt{\frac{1}{W}\left[w_1 x_1^2 + w_2 x_2^2 + \cdots + w_N x_N^2\right]}$$
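As a quick arithmetic check of the definition, take the two-element vector x = [3; 4] with no weights, so that each weight is 1 and W = 2. The values are chosen purely for illustration.

```latex
\mathrm{RMS} = \sqrt{\tfrac{1}{2}\left[3^2 + 4^2\right]}
             = \sqrt{\tfrac{25}{2}}
             = \sqrt{12.5}
             \approx 3.536
```

The result lies between the arithmetic mean (3.5) and the maximum (4), as expected for a root-mean-square of non-negative values.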
Variance
The variance, μ, is defined by
$$\mu = \frac{N}{W(N-1)}\left[w_1(x_1-M)^2 + w_2(x_2-M)^2 + \cdots + w_N(x_N-M)^2\right]$$
Standard deviation
The standard deviation, σ, is defined by
$$\sigma = \sqrt{\mu}$$
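The variance and standard deviation definitions can be sketched in Python as below. The prefactor is read as N divided by the product W(N-1), so that with unit weights (W = N) it reduces to the familiar 1/(N-1). The function names are hypothetical.

```python
import math


def weighted_variance(x, w=None):
    """Variance with prefactor N / (W * (N - 1)); weights default to 1."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two values are required")
    if w is None:
        w = [1.0] * n
    if len(w) != n:
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    squared = sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x))
    return n / (total_weight * (n - 1)) * squared


def weighted_sdev(x, w=None):
    """Standard deviation: square root of the variance."""
    return math.sqrt(weighted_variance(x, w))
```

One consequence of this normalization is that multiplying every weight by the same constant leaves the variance unchanged.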
Average deviation
The average deviation, or mean deviation, δ, is defined by
$$\delta = \frac{1}{W}\left[w_1|x_1-M| + w_2|x_2-M| + \cdots + w_N|x_N-M|\right]$$
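A short Python sketch of the average deviation, using the same weighted mean M as above. The function name `average_deviation` is chosen for illustration.

```python
def average_deviation(x, w=None):
    """Average (mean absolute) deviation about the weighted mean."""
    if not x:
        raise ValueError("x must not be empty")
    if w is None:
        w = [1.0] * len(x)
    if len(w) != len(x):
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    return sum(wi * abs(xi - mean) for wi, xi in zip(w, x)) / total_weight
```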
Skewness
The skewness, or third moment, skew, is a nondimensional quantity that characterizes the degree of asymmetry of a distribution around its mean. The skewness is a pure number that characterizes only the shape of the distribution, and is defined by
$$\mathrm{skew} = \frac{1}{W}\left\{w_1\left[\frac{x_1-M}{\sigma}\right]^3 + w_2\left[\frac{x_2-M}{\sigma}\right]^3 + \cdots + w_N\left[\frac{x_N-M}{\sigma}\right]^3\right\}$$
A positive value of skewness signifies a distribution with an asymmetric tail extending out towards more positive x; a negative value signifies a distribution whose tail extends out towards more negative x.
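The sign convention can be checked with a small Python sketch. It uses the mean and standard deviation exactly as defined earlier in this section (including the N/(W(N-1)) variance prefactor); the function name `skewness` is hypothetical.

```python
import math


def skewness(x, w=None):
    """Skewness: weighted mean of the cubed standardized deviations."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two values are required")
    if w is None:
        w = [1.0] * n
    if len(w) != n:
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    variance = n / (total_weight * (n - 1)) * sum(
        wi * (xi - mean) ** 2 for wi, xi in zip(w, x))
    sigma = math.sqrt(variance)
    if sigma == 0:
        raise ValueError("skewness is undefined for constant data")
    return sum(wi * ((xi - mean) / sigma) ** 3
               for wi, xi in zip(w, x)) / total_weight
```

A symmetric sample such as [1; 2; 3; 4; 5] gives a skewness of zero, while a sample with one large outlier on the high side, such as [1; 1; 1; 10], gives a positive value.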
Kurtosis
The kurtosis, kurt, is a nondimensional quantity which measures the relative peakedness or flatness of a distribution, relative to a normal distribution. A distribution with positive kurtosis is termed leptokurtic; a distribution with negative kurtosis is termed platykurtic. An in-between distribution is termed mesokurtic. The kurtosis is defined by
$$\mathrm{kurt} = \frac{1}{W}\left\{w_1\left[\frac{x_1-M}{\sigma}\right]^4 + w_2\left[\frac{x_2-M}{\sigma}\right]^4 + \cdots + w_N\left[\frac{x_N-M}{\sigma}\right]^4\right\} - 3$$
where the -3 term makes the value zero for a normal distribution.
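A Python sketch of the kurtosis follows. It assumes that the sum of fourth powers is normalized by 1/W, as in the skewness definition, since that normalization is what lets the -3 offset give zero for a normal distribution. The function name `kurtosis` is hypothetical.

```python
import math


def kurtosis(x, w=None):
    """Excess kurtosis: weighted mean of fourth standardized powers, minus 3.

    Assumes 1/W normalization of the sum (an assumption of this sketch).
    """
    n = len(x)
    if n < 2:
        raise ValueError("at least two values are required")
    if w is None:
        w = [1.0] * n
    if len(w) != n:
        raise ValueError("w and x must have the same length")
    total_weight = sum(w)
    mean = sum(wi * xi for wi, xi in zip(w, x)) / total_weight
    variance = n / (total_weight * (n - 1)) * sum(
        wi * (xi - mean) ** 2 for wi, xi in zip(w, x))
    sigma = math.sqrt(variance)
    if sigma == 0:
        raise ValueError("kurtosis is undefined for constant data")
    fourth = sum(wi * ((xi - mean) / sigma) ** 4
                 for wi, xi in zip(w, x)) / total_weight
    return fourth - 3.0
```

Evenly spaced data are flatter than a normal distribution, so a sample such as [1; 2; 3; 4; 5] yields a negative (platykurtic) value under this definition.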
Moments
Syntax: |
STATISTICS\MOMENTS w x n { s }
|
If the \MOMENTS command qualifier is used, the nth moment of vector x, with weight w, is calculated and optionally stored in output scalar s. The moment number, n, can be any integer > 0.
$$s = \frac{1}{W}\left[w_1 x_1^n + w_2 x_2^n + \cdots + w_N x_N^n\right]$$
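The moment formula can be sketched in Python as below. Note that, as written, it is a moment about zero rather than about the mean. The function name `moment` is chosen for illustration.

```python
def moment(x, w, n):
    """n-th weighted moment about zero: (1/W) * sum(w_i * x_i**n)."""
    if not isinstance(n, int) or n <= 0:
        raise ValueError("n must be an integer greater than 0")
    if not x or len(w) != len(x):
        raise ValueError("w and x must be non-empty and of equal length")
    total_weight = sum(w)
    return sum(wi * xi ** n for wi, xi in zip(w, x)) / total_weight
```

With n = 1 this expression is the same as the mean value M, and the square root of the n = 2 moment is the same as the root-mean-square defined earlier.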
Linear correlation coefficient
Syntax: |
STATISTICS\PEARSON x y { r p }
|
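Pearson's r, or the linear correlation coefficient, is widely used as a measure of association between variables that are continuous. For pairs of quantities $(x_i, y_i)$, for i = 1, 2, ..., N, the linear correlation coefficient r is given by the formula:
$$r = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{N}(y_i-\bar{y})^2}}$$
where $\bar{x}$ is the mean of x, and $\bar{y}$ is the mean of y.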
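A minimal Python sketch of the correlation coefficient follows, assuming the standard definition of Pearson's r: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` is hypothetical, and the significance level p is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient for paired data."""
    n = len(x)
    if n != len(y):
        raise ValueError("x and y must have the same length")
    if n < 2:
        raise ValueError("at least two pairs are required")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return sxy / math.sqrt(sxx * syy)
```

Points on a line of positive slope give r = +1 whatever the slope, and points on a line of negative slope give r = -1.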
The value of r lies between -1 and +1, inclusive. It takes on a value of +1 when the data points lie on a straight line with positive slope, that is, when x and y increase together. The value +1 holds independent of the magnitude of this slope. If the data points lie on a straight line with negative slope, so that y decreases as x increases, then r has the value -1. A value of r near zero indicates that the variables x and y are uncorrelated.
r is a way of summarizing the strength of a correlation which is known to be significant, but it is a poor statistic for deciding whether an observed correlation is statistically significant, and/or whether one observed correlation is significantly stronger than another. The reason is that r is ignorant of the individual distributions of x and y, so there is no universal way to compute its distribution in the case of the null hypothesis.
The STATISTICS\PEARSON command returns Pearson's r in the scalar variable r. It also returns scalar p, the significance level at which the null hypothesis of zero correlation is disproved. A small value of p indicates a significant correlation.
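The significance level is computed from
$$p = I_{\nu/(\nu+t^2)}\left(\frac{\nu}{2}, \frac{1}{2}\right), \qquad \nu = N - 2$$
where $I$ is the incomplete Beta function and t is defined by:
$$t = r\sqrt{\frac{N-2}{1-r^2}}$$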
Examples
Suppose you have a vector X=[1.2;2.1;3.2;4.5;5;6;7]. Entering
STATISTICS X
displays all of the calculated statistics for X.
If you want to use the values for the maximum, minimum and mean of X, enter:
STATISTICS X XMEAN\MEAN XMIN\MIN XMAX\MAX
and you will have the scalars: XMAX=7, XMIN=1.2, and XMEAN=4.142857
If you also want the index values for the maximum and the minimum of X, enter:
STATISTICS X XMEAN\MEAN XMIN\MIN XMAX\MAX IMX\IMAX IMN\IMIN
and you will also have scalars: IMX=7 and IMN=1.
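The scalar values quoted in these examples can be cross-checked outside the program. The following Python sketch recomputes them for the same vector; the lowercase variable names simply mirror the scalars above, and the 1-based index convention is inferred from IMN=1 referring to the first element.

```python
x = [1.2, 2.1, 3.2, 4.5, 5, 6, 7]

xmax = max(x)
xmin = min(x)
xmean = sum(x) / len(x)

# list.index is 0-based, so add 1 to match the 1-based indices shown above.
imx = x.index(xmax) + 1
imn = x.index(xmin) + 1

print(f"XMAX={xmax} XMIN={xmin} XMEAN={xmean:.6f} IMX={imx} IMN={imn}")
```

The mean is 29/7, which rounds to the 4.142857 shown for XMEAN.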