| score {supclust} | R Documentation |
For a set of n observations grouped into two classes (for
example, n expression values of a gene), the score
function measures the separation of the classes. It can be interpreted
as counting, for each observation with response 0, the number of
observations in response class 1 that are smaller, and summing
these counts.
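The counting interpretation can be written out directly. The following is a minimal sketch, not the package's implementation: `score_naive` is a hypothetical helper, and the strict comparison (ties count as "not smaller") is an assumption, since the description does not say how ties are handled.

```r
# Hypothetical helper, not part of supclust: a literal transcription of
# the counting description above.
score_naive <- function(x, resp) {
  stopifnot(length(x) == length(resp), all(resp %in% c(0, 1)))
  x0 <- x[resp == 0]  # observations with response 0
  x1 <- x[resp == 1]  # observations with response 1
  # outer() compares every class-0 value with every class-1 value;
  # summing the TRUE entries counts the pairs in which the class-1
  # value is strictly smaller (assumption: ties are not counted).
  sum(outer(x0, x1, ">"))
}

# Toy data (made up for illustration): class 0 lies entirely below class 1
x    <- c(1.2, 3.4, 0.5, 2.8, 4.1, 0.9)
resp <- c(0,   1,   0,   1,   1,   0)

score_naive(x, resp)    # no class-1 value is below any class-0 value
score_naive(-x, resp)   # ordering reversed: every pair is counted
```

With three cases per class, the two calls return the two extremes of the range, 0 and 3 * 3 = 9.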
score(x, resp)
x |
Numeric vector of length n, for example containing gene or cluster expression values of n different cases. |
resp |
Numeric vector of length n containing the "binary"
class labels of the cases. Must be coded as 0 and 1. |
A numeric value, the score. The minimal score is
zero; the maximal score is the product of the numbers of samples
in class 0 and class 1. Values near the minimal or maximal
score indicate good separation, whereas intermediate
scores indicate poor separation.
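To make this reading of the value concrete, the sketch below rescales the distance of a score from the midpoint of its range. `separation_strength` is a hypothetical helper introduced only for illustration (it is not part of supclust); under that assumption it returns 1 at either extreme and values near 0 for scores around the middle of the range.

```r
# Hypothetical helper, not part of supclust: how far a score lies from
# the midpoint of its possible range [0, n0 * n1], rescaled to [0, 1].
separation_strength <- function(s, resp) {
  n0 <- sum(resp == 0)
  n1 <- sum(resp == 1)
  stopifnot(n0 > 0, n1 > 0)   # both classes must be present
  max_score <- n0 * n1        # maximal score, as stated above
  abs(s - max_score / 2) / (max_score / 2)
}

resp <- c(0, 1, 0, 1, 1, 0)   # three cases per class, so max_score is 9

separation_strength(0, resp)  # minimal score: perfect separation
separation_strength(9, resp)  # maximal score: perfect separation, reversed
separation_strength(4, resp)  # intermediate score: poor separation
```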
Marcel Dettling, dettling@stat.math.ethz.ch
Marcel Dettling (2002) Supervised Clustering of Genes, see http://stat.ethz.ch/~dettling/supercluster.html
Marcel Dettling and Peter Bühlmann (2002). Supervised Clustering of Genes. Genome Biology, 3(12): research0069.1-0069.15.
wilma; margin is the second statistic
that is used there.
data(leukemia, package="supclust")
op <- par(mfrow=c(1,3))
plot(leukemia.x[,69],leukemia.y)
title(paste("Score = ", score(leukemia.x[,69], leukemia.y)))
## Sign-flipping is very important
plot(leukemia.x[,161],leukemia.y)
title(paste("Score = ", score(leukemia.x[,161], leukemia.y)))
x <- sign.flip(leukemia.x, leukemia.y)$flipped.matrix
plot(x[,161],leukemia.y)
title(paste("Score = ", score(x[,161], leukemia.y)))
par(op)