| ksample.e {energy} | R Documentation |
Returns the E-statistic (energy statistic) for the multivariate k-sample test of equal distributions.
ksample.e(x, sizes, distance = FALSE, ix = 1:sum(sizes))
| x | data matrix of pooled sample |
| sizes | vector of sample sizes |
| distance | logical: if TRUE, x is a distance matrix |
| ix | a permutation of the row indices of x |
The k-sample multivariate E-statistic for testing equal distributions
is returned. The statistic is computed from the original pooled samples, stacked in
matrix x where each row is a multivariate observation, or from the distance
matrix x of the original data. The
first sizes[1] rows of x are the first sample, the next
sizes[2] rows of x are the second sample, etc.
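As an illustration of the expected input layout, the following base R sketch stacks two samples into a pooled matrix and builds the matching sizes vector. The object names (s1, s2, x, sizes, d) are made up for this example, and the commented calls assume the energy package is installed.

```r
# Two hypothetical samples of bivariate observations (one observation per row).
set.seed(42)
s1 <- matrix(rnorm(20), ncol = 2)            # 10 observations
s2 <- matrix(rnorm(30, mean = 1), ncol = 2)  # 15 observations

# Pooled sample: the rows of s1 come first, then the rows of s2.
x <- rbind(s1, s2)
sizes <- c(nrow(s1), nrow(s2))

# Alternative input: the Euclidean distance matrix of the pooled sample.
d <- as.matrix(dist(x))

# With the energy package loaded, the two input forms would be passed as:
# ksample.e(x, sizes)
# ksample.e(d, sizes, distance = TRUE)
```

The order of the rows matters: sizes only records how many consecutive rows belong to each sample.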
The two-sample E-statistic proposed by Szekely and Rizzo (2004) is the e-distance e(S_i,S_j), defined for two samples S_i and S_j of sizes n_i and n_j by
e(S_i, S_j) = (n_i n_j)/(n_i+n_j) [2M_(ij)-M_(ii)-M_(jj)],
where
M_(ij) = 1/(n_i n_j) sum[p = 1:n_i, q = 1:n_j] ||X_(ip) - X_(jq)||,
|| || denotes Euclidean norm, and X_(ip) denotes the p-th observation in the i-th sample. The k-sample E-statistic is defined by summing the pairwise e-distances over all k(k-1)/2 pairs of samples:
E = sum[i<j] e(S_i,S_j).
Large values of E are significant.
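The definition above can be written out directly in base R. The function below is a minimal sketch for illustration, not the package's implementation; the name e_stat_sketch is hypothetical, and it stores the full distance matrix, so it is only practical for moderate sample sizes.

```r
# Hypothetical helper: k-sample E-statistic computed directly from the definition.
e_stat_sketch <- function(x, sizes) {
  x <- as.matrix(x)
  stopifnot(nrow(x) == sum(sizes))
  d <- as.matrix(dist(x))               # Euclidean distances between all rows
  grp <- rep(seq_along(sizes), sizes)   # sample label of each row
  k <- length(sizes)
  total <- 0
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      ni <- sizes[i]
      nj <- sizes[j]
      m_ij <- mean(d[grp == i, grp == j])  # M_(ij): mean between-sample distance
      m_ii <- mean(d[grp == i, grp == i])  # M_(ii): includes the zero diagonal
      m_jj <- mean(d[grp == j, grp == j])  # M_(jj)
      # e-distance of the pair (S_i, S_j), added to the running sum
      total <- total + ni * nj / (ni + nj) * (2 * m_ij - m_ii - m_jj)
    }
  }
  total
}

# Two univariate samples {0, 0} and {1, 1}:
# M_(12) = 1, M_(11) = M_(22) = 0, so E = (2 * 2 / 4) * 2 = 2.
e_stat_sketch(c(0, 0, 1, 1), c(2, 2))
```

If the energy package is available, the value of e_stat_sketch(iris[, 1:4], c(50, 50, 50)) can be compared with ksample.e(iris[, 1:4], c(50, 50, 50)).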
The value of the multisample E-statistic corresponding to
the permutation ix is returned.
The pairwise e-distances between samples can be conveniently
computed by the edist function, which returns a dist object.
The function ksample.e computes the E-statistic only.
For the test decision, a nonparametric bootstrap test (approximate permutation test)
is provided by the function eqdist.etest. With the default arguments,
ksample.e computes the statistic without storing the distance matrix.
For the test statistic only, ksample.e is usually faster than calling
eqdist.e, but for a permutation test the method of calculation in
eqdist.etest computes the replicates much faster.
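To illustrate why a permutation test can benefit from a stored distance matrix, here is a base R sketch in which the distances are computed once and each replicate only permutes row indices, in the role played by the ix argument. The helper names e_from_dist and perm_test_sketch are hypothetical, and this is not claimed to be the method used inside eqdist.etest.

```r
# Hypothetical helper: E-statistic from a full distance matrix d, with the
# rows assigned to samples in the order given by the index vector ix.
e_from_dist <- function(d, sizes, ix = seq_len(sum(sizes))) {
  grp <- rep(seq_along(sizes), sizes)
  k <- length(sizes)
  total <- 0
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      a <- ix[grp == i]
      b <- ix[grp == j]
      w <- sizes[i] * sizes[j] / (sizes[i] + sizes[j])
      total <- total + w * (2 * mean(d[a, b]) - mean(d[a, a]) - mean(d[b, b]))
    }
  }
  total
}

# Hypothetical helper: approximate permutation test with R random permutations.
perm_test_sketch <- function(x, sizes, R = 199) {
  d <- as.matrix(dist(as.matrix(x)))  # computed once, reused by every replicate
  n <- sum(sizes)
  observed <- e_from_dist(d, sizes)
  reps <- replicate(R, e_from_dist(d, sizes, sample(n)))
  # Large values of E are significant, so count replicates at least as large.
  p_value <- (1 + sum(reps >= observed)) / (R + 1)
  list(statistic = observed, p.value = p_value, replicates = reps)
}

set.seed(1)
x <- c(rnorm(30), rnorm(30, mean = 3))   # two univariate samples, shifted means
res <- perm_test_sketch(x, c(30, 30), R = 99)
res$statistic
res$p.value
```

Each replicate here costs only index lookups and means over the stored matrix, whereas recomputing distances for every permutation would repeat the most expensive step.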
Maria L. Rizzo mrizzo @ bgnet.bgsu.edu and Gabor J. Szekely gabors @ bgnet.bgsu.edu
Szekely, G. J. and Rizzo, M. L. (2004) Testing for Equal Distributions in High Dimension, InterStat, November (5).
Szekely, G. J. (2000) Technical Report 03-05: E-statistics: Energy of Statistical Samples, Department of Mathematics and Statistics, Bowling Green State University.
eqdist.etest
edist
energy.hclust
## compute 3-sample E-statistic for 4-dimensional iris data
data(iris)
ksample.e(iris[,1:4], c(50,50,50))

## compute a 3-sample univariate E-statistic
ksample.e(rnorm(150), c(25,75,50))