| shrinkcat.stat {st} | R Documentation |
shrinkcat.stat and shrinkcat.fun computes a shrinkage
estimate of the ``correlation-adjusted t score''
of Zuber and Strimmer (2009).
shrinkcat.stat(X, L, verbose=TRUE) shrinkcat.fun(L, verbose=TRUE)
X |
data matrix. Note that the columns correspond to variables (``genes'') and the rows to samples. |
L |
vector with class labels for the two groups. |
verbose |
print out some (more or less useful) information during computation. |
The cat (``correlation-adjusted t'') score is the product of the square root of the
inverse correlation matrix with a vector of t scores. In Zuber and Strimmer (2009)
it is shown that the cat score is
a natural criterion to rank genes according to their ability to seperate two classes
in the presence of correlation among genes.
If there is no correlation, the cat score reduces to the usual t score
(hence in this case the estimate from shrinkcat.stat equals that from shrinkt.stat).
shrinkcat.stat returns a vector containing a shrinkage estimate of the
``cat score'' for each variable/gene.
The corresponding shrinkcat.fun functions return a function that
computes the cat score when applied to a data matrix
(this is very useful for simulations).
Verena Zuber and Korbinian Strimmer (http://strimmerlab.org).
Zuber, V., and K. Strimmer. 2009. Gene ranking and biomarker discovery under correlation. See http://arxiv.org/abs/0902.0751 for publication details.
# load st library
library("st")
# load full Khan et al (2001) data set
data(khan2001)
# create data set containing only the RMS and EWS samples
idx = which( khan2001$y == "RMS" | khan2001$y == "EWS")
X = khan2001$x[idx,]
L = factor(khan2001$y[idx])
dim(X)
L
# shrinkage cat statistic
score = shrinkcat.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# [1] 1389 1955 509 1003 246 187 2050 2046 545 1954
# compute q-values and local false discovery rates
library("fdrtool")
fdr.out = fdrtool(as.vector(score))
sum(fdr.out$qval < 0.05)
sum(fdr.out$lfdr < 0.2)
# compared with:
# shrinkage t statistic
score = shrinkt.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# [1] 1389 1955 187 246 1003 2046 2050 509 545 1799
# student t statistic
score = studentt.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# 1] 1389 1955 1003 187 246 2050 2046 1799 509 545
# difference of means ("Fold Change")
score = diffmean.stat(X, L)
idx = order(abs(score), decreasing=TRUE)
idx[1:10]
# [1] 509 187 1372 1955 246 1954 430 1645 545 129