dlda                package:supclust                R Documentation

_C_l_a_s_s_i_f_i_c_a_t_i_o_n _w_i_t_h _W_i_l_m_a'_s _C_l_u_s_t_e_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     The four functions 'nnr' (nearest neighbor rule), 'dlda' (diagonal
     linear discriminant analysis), 'logreg' (logistic regression) and
     'aggtrees' (aggregated trees) are used for binary classification
     with the cluster representatives of Wilma's output.

_U_s_a_g_e:

     dlda(xlearn, xtest, ylearn)
     nnr(xlearn, xtest, ylearn)
     logreg(xlearn, xtest, ylearn)
     aggtrees(xlearn, xtest, ylearn)

_A_r_g_u_m_e_n_t_s:

  xlearn: Numeric matrix of explanatory variables (q variables in
          columns, n cases in rows), containing the learning or
          training data. Typically, these are the (gene) cluster
          representatives of Wilma's output.

   xtest: Numeric matrix of explanatory variables (q variables in
          columns, m cases in rows), containing the test or validation
          data. Typically, these are the fitted (gene) cluster
          representatives of Wilma's output for the test data,
          obtained from 'predict.wilma'.

  ylearn: Numeric vector of length n containing the class labels for
          the training observations. The labels must be coded as 0
          and 1.
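
     The requirements above can be checked before calling any of the
     four classifiers. The following is a minimal sketch in base R;
     'check_inputs' is a hypothetical helper for illustration and is
     not part of supclust.

```r
## Hypothetical helper (not part of supclust): verifies that the three
## arguments have the shapes and coding described under Arguments.
check_inputs <- function(xlearn, xtest, ylearn) {
  stopifnot(
    is.matrix(xlearn), is.numeric(xlearn),
    is.matrix(xtest),  is.numeric(xtest),
    ncol(xlearn) == ncol(xtest),      # same q variables in both matrices
    length(ylearn) == nrow(xlearn),   # one label per training case
    all(ylearn %in% c(0, 1))          # labels coded as 0 and 1
  )
  invisible(TRUE)
}

set.seed(1)
xlearn <- matrix(rnorm(200), nrow = 20, ncol = 10)
xtest  <- matrix(rnorm(80),  nrow = 8,  ncol = 10)
ylearn <- rep(c(0, 1), each = 10)
check_inputs(xlearn, xtest, ylearn)
```

     A call with labels coded differently, for example as 1 and 2,
     stops with an error in this sketch.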

_D_e_t_a_i_l_s:

     'nnr' implements the 1-nearest-neighbor rule with Euclidean
     distance. 'dlda' is linear discriminant analysis under the
     restriction that the covariance matrix is diagonal with equal
     variance for all predictors. 'logreg' is default logistic
     regression. 'aggtrees' fits a default stump (a classification tree
     with two terminal nodes) with 'rpart' for every predictor variable
     and combines the stumps by majority voting to obtain the final
     classifier.
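
     To make the first two rules concrete, here is a base R sketch.
     It is not the package's code: 'nn1_sketch' and 'centroid_sketch'
     are hypothetical names. The second function assumes equal class
     priors, in which case a diagonal covariance matrix with one
     common variance reduces the discriminant rule to picking the
     class whose mean vector is closest in Euclidean distance.

```r
## 1-nearest-neighbor rule with Euclidean distance (illustrative only).
nn1_sketch <- function(xlearn, xtest, ylearn) {
  apply(xtest, 1, function(z) {
    d2 <- rowSums(sweep(xlearn, 2, z)^2)  # squared distance to each training case
    ylearn[which.min(d2)]                 # label of the closest training case
  })
}

## Nearest-centroid rule: what the stated restriction amounts to
## when equal class priors are assumed (illustrative only).
centroid_sketch <- function(xlearn, xtest, ylearn) {
  m0 <- colMeans(xlearn[ylearn == 0, , drop = FALSE])  # class 0 mean vector
  m1 <- colMeans(xlearn[ylearn == 1, , drop = FALSE])  # class 1 mean vector
  d0 <- rowSums(sweep(xtest, 2, m0)^2)
  d1 <- rowSums(sweep(xtest, 2, m1)^2)
  as.numeric(d1 < d0)                                  # 1 if closer to class 1
}

## Toy data with two well-separated classes
xl <- rbind(c(0, 0), c(0, 1), c(1, 0), c(5, 5), c(5, 6), c(6, 5))
yl <- c(0, 0, 0, 1, 1, 1)
xt <- rbind(c(0.2, 0.1), c(5.5, 5.2))

nn1_sketch(xl, xt, yl)       # 0 1
centroid_sketch(xl, xt, yl)  # 0 1
```

     Both functions return a numeric vector with one 0/1 label per
     row of the test matrix, matching the form described under Value.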

_V_a_l_u_e:

     Numeric vector of length m, containing the predicted class labels
     for the test observations. The class labels are coded as 0 and 1.
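
     Because the result is a plain 0/1 vector, it can be compared
     directly with known test labels. In the sketch below, 'pred'
     stands in for the output of one of the four classifiers and
     'ytest' is a hypothetical vector of true test labels; neither is
     produced by the package here.

```r
## Stand-in values for illustration only
pred  <- c(0, 1, 1, 0, 1, 0, 0, 1)   # predicted labels (length m = 8)
ytest <- c(0, 1, 0, 0, 1, 1, 0, 1)   # hypothetical true labels

## Misclassification rate: share of test cases with a wrong label
err <- mean(pred != ytest)
err                                   # 0.25

## Confusion table of predicted versus true labels
table(predicted = pred, true = ytest)
```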

_A_u_t_h_o_r(_s):

     Marcel Dettling, dettling@stat.math.ethz.ch

_R_e_f_e_r_e_n_c_e_s:

     Marcel Dettling (2002) _Supervised Clustering of Genes_, see <URL:
     http://stat.ethz.ch/~dettling/supercluster.html>

     Marcel Dettling and Peter Bühlmann (2002). Supervised Clustering
     of Genes. _Genome Biology_, *3*(12): research0069.1-0069.15.

_S_e_e _A_l_s_o:

     'wilma'

_E_x_a_m_p_l_e_s:

     ## Generating random learning data: 20 observations and 10 variables (clusters)
     set.seed(342)
     xlearn <- matrix(rnorm(200), nrow = 20, ncol = 10)

     ## Generating random test data: 8 observations and 10 variables (clusters)
     xtest  <- matrix(rnorm(80),  nrow = 8,  ncol = 10)

     ## Generating random class labels for the learning data
     ylearn <- as.numeric(runif(20)>0.5)

     ## Predicting the class labels for the test data
     nnr(xlearn, xtest, ylearn)
     dlda(xlearn, xtest, ylearn)
     logreg(xlearn, xtest, ylearn)
     aggtrees(xlearn, xtest, ylearn)

