mrpls              package:plsgenomics              R Documentation

_R_i_d_g_e _P_a_r_t_i_a_l _L_e_a_s_t _S_q_u_a_r_e _f_o_r _c_a_t_e_g_o_r_i_c_a_l _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     The function 'mrpls' performs prediction using the MRPLS algorithm
     of Fort et al. (2005).

_U_s_a_g_e:

     mrpls(Ytrain,Xtrain,Lambda,ncomp,Xtest=NULL,NbIterMax=50)

_A_r_g_u_m_e_n_t_s:

  Xtrain: a (ntrain x p) data matrix of predictors. 'Xtrain' must be a
          matrix.  Each row corresponds to an observation and each
          column to a predictor variable.

  Ytrain: a vector of length ntrain containing the response for each
          observation. 'Ytrain' must be a vector with values in
          {1,...,c+1}, where c+1 is the number of classes.

   Xtest: a (ntest x p) matrix containing the predictors for the test
          data set. 'Xtest' may also be a vector of length p
          (corresponding to a single test observation). If 'Xtest' is
          not NULL, the prediction step is carried out for these new
          observations.

  Lambda: a positive real value. 'Lambda' is the ridge regularization
          parameter.

   ncomp: a non-negative integer. 'ncomp' is the number of PLS
          components. If 'ncomp'=0, then ridge regression is performed
          without dimension reduction.

NbIterMax: a positive integer. 'NbIterMax' is the maximal number of
          iterations in the Newton-Raphson parts.
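
     As an illustration of the input requirements above, the following
     base R sketch converts a data frame and character labels into a
     matrix and a {1,...,c+1}-valued vector. The objects 'raw_x' and
     'raw_y' are hypothetical and are not part of the package.

```r
# Hypothetical raw inputs: a data frame of predictors and character labels.
raw_x <- data.frame(g1 = c(0.2, 1.5, -0.3, 0.8),
                    g2 = c(1.1, 0.4, 0.9, -1.2))
raw_y <- c("EWS", "BL", "EWS", "NB")

# 'Xtrain' must be a matrix, one row per observation.
Xtrain <- as.matrix(raw_x)

# 'Ytrain' must be a vector with values in {1,...,c+1}.
# factor() sorts the distinct labels; as.integer() returns their codes.
y_factor <- factor(raw_y)
Ytrain <- as.integer(y_factor)

# Number of classes (c+1) and the label behind each integer code.
n_classes <- nlevels(y_factor)
class_names <- levels(y_factor)
```

     Keeping 'class_names' makes it possible to map predicted integer
     labels back to the original class names.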

_D_e_t_a_i_l_s:

     The columns of the data matrices 'Xtrain' and 'Xtest' need not be
     standardized, since standardization is performed by the function
     'mrpls' as a preliminary step before the algorithm is run.
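
     The kind of preprocessing described above can be sketched in base
     R as follows. This is an illustration only, not the code used by
     'mrpls': the exact scaling convention is not stated here, and the
     sketch assumes centring and scaling by the sample standard
     deviation, with zero-variance columns set aside first. The names
     'X', 'deleted_col' and 'X_std' are hypothetical.

```r
# Small synthetic matrix; the third column is constant (zero variance).
X <- cbind(c(1, 2, 3, 4), c(10, 20, 10, 40), c(5, 5, 5, 5))

# Columns whose variance is zero cannot be scaled.
col_var <- apply(X, 2, var)
deleted_col <- which(col_var == 0)
if (length(deleted_col) == 0) deleted_col <- NULL

# Centre and scale the remaining columns (assumed convention).
keep <- setdiff(seq_len(ncol(X)), deleted_col)
X_std <- scale(X[, keep, drop = FALSE], center = TRUE, scale = TRUE)
```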

     The procedure described in Fort et al. (2005) is used to determine
     the latent components used for classification. When 'Xtest' is
     not NULL, the procedure also predicts the labels of these new
     observations.

_V_a_l_u_e:

     A list with the following components: 

   Ytest: the ntest vector containing the predicted labels for the
          observations from  'Xtest'.

Coefficients: the (p+1) x c matrix containing the coefficients
          weighting the block  design matrix.

DeletedCol: the vector containing the indices of the columns of
          'Xtrain' whose variance is null (zero). If there is no such
          column, 'DeletedCol'=NULL.

    hatY: if 'ncomp' is greater than 1, 'hatY' is a matrix of size
          ntest x ncomp whose kth column contains the labels
          predicted with k PLS components.
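
     To show how a (p+1) x c coefficient matrix can be turned into
     predicted labels for several observations at once, here is a
     minimal base R sketch. It follows the convention used in the
     Examples (first row = intercepts, class 1 as reference with linear
     predictor 0). The numeric values are hypothetical and were not
     produced by 'mrpls'.

```r
# Hypothetical coefficient matrix for p = 2 predictors and c = 2
# (three classes); the first row holds the intercepts.
Coefficients <- rbind(c(-0.5, -1.0),
                      c( 1.0,  0.0),
                      c(-2.0,  3.0))

# Hypothetical test matrix with three observations.
Xtest <- rbind(c(0, 0), c(2, 0), c(0, 2))

# Linear predictors for classes 2,...,c+1; class 1 is the reference
# class and its linear predictor is fixed at 0.
eta <- cbind(1, Xtest) %*% Coefficients

# Predicted label = index of the largest linear predictor.
Ypred <- max.col(cbind(0, eta), ties.method = "first")
Ypred
```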

_A_u_t_h_o_r(_s):

     Sophie Lambert-Lacroix (<URL:
     http://www-lmc.imag.fr/lmc-sms/Sophie.Lambert>).

_R_e_f_e_r_e_n_c_e_s:

     G. Fort, S. Lambert-Lacroix and J. Peyre (2005). Réduction de
     dimension dans les modèles linéaires généralisés : application à
     la classification supervisée de données issues des biopuces.
     Journal de la SFDS, tome 146, n°1-2, 117-152.

_S_e_e _A_l_s_o:

     'mrpls.cv', 'rpls', 'rpls.cv'.

_E_x_a_m_p_l_e_s:

     # load plsgenomics library
     library(plsgenomics)

     # load SRBCT data
     data(SRBCT)
     # draw a learning set with a fixed number of samples per class
     IndexLearn <- c(sample(which(SRBCT$Y==1),10),
                     sample(which(SRBCT$Y==2),4),
                     sample(which(SRBCT$Y==3),7),
                     sample(which(SRBCT$Y==4),9))

     # perform prediction by MRPLS
     res <- mrpls(Ytrain=SRBCT$Y[IndexLearn],
                  Xtrain=SRBCT$X[IndexLearn,],
                  Lambda=0.001,ncomp=2,
                  Xtest=SRBCT$X[-IndexLearn,])
     # number of misclassified test observations
     sum(res$Ytest!=SRBCT$Y[-IndexLearn])

     # prediction for another sample
     Xnew <- SRBCT$X[83,]
     # Compute the linear predictor for each class except class 1
     eta <- diag(t(cbind(c(1,Xnew),c(1,Xnew),c(1,Xnew))) %*% res$Coefficients)
     Ypred <- which.max(c(0,eta))
     Ypred
     SRBCT$Y[83]
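
     The per-class sampling used to build 'IndexLearn' above can be
     wrapped in a small helper. The sketch below uses synthetic labels
     so that it runs without the SRBCT data; the function
     'stratified_index' is hypothetical and is not provided by
     plsgenomics.

```r
set.seed(1)  # reproducible draw

# Hypothetical labels: four classes with unequal sizes.
Y <- rep(1:4, times = c(12, 6, 9, 10))

# Hypothetical helper: draw n_per_class[k] indices from class k.
stratified_index <- function(y, n_per_class) {
  unlist(lapply(seq_along(n_per_class), function(k) {
    idx <- which(y == k)
    # Index into idx so a class with a single member is handled safely.
    idx[sample.int(length(idx), n_per_class[k])]
  }))
}

IndexLearn <- stratified_index(Y, c(6, 3, 5, 5))

# Class counts in the learning set and in the held-out set.
table(Y[IndexLearn])
table(Y[-IndexLearn])
```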

