plotTrainTest            package:hddplot            R Documentation

_P_l_o_t _p_r_e_d_i_c_t_i_o_n_s _f_o_r _b_o_t_h _a _I/_I_I _t_r_a_i_n/_t_e_s_t _s_p_l_i_t, _a_n_d _t_h_e _r_e_v_e_r_s_e

_D_e_s_c_r_i_p_t_i_o_n:

     Specifies a division of the data into two sets, I and II, for use
     with linear discriminant analysis. Feature selection and model
     fitting are performed twice: first with I as the training set and
     II as the test set, then with the roles reversed. Two graphs are
     plotted: the test-set scores from the I (training) / II (test)
     pass, and those from the II/I pass.
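
     The two-pass procedure described above can be sketched without
     'hddplot' as follows. This is only an illustration under stated
     assumptions: 'rank_features' and 'one_pass' are hypothetical
     helpers, and ranking by a one-way ANOVA F-statistic is a stand-in
     that may differ from what 'orderFeatures' does. Only 'lda' and
     'predict' from package 'MASS' are real library calls here.

```r
library(MASS)  # provides lda()

set.seed(1)
x <- matrix(rnorm(1000), ncol = 20)            # 50 features x 20 observations
cl <- factor(rep(1:3, c(7, 9, 4)))             # three classes
fold <- factor(rep(c("I", "II"), length.out = 20))  # assumed split into I and II

# Hypothetical stand-in for orderFeatures(): rank features by a one-way
# ANOVA F-statistic, computed on the training observations only.
rank_features <- function(x, cl, subset) {
  g <- cl[subset]
  fstat <- apply(x[, subset, drop = FALSE], 1, function(v) {
    anova(lm(v ~ g))[["F value"]][1]
  })
  order(fstat, decreasing = TRUE)
}

# One pass: select nf features and fit on `train`, return scores for `test`.
one_pass <- function(x, cl, train, test, nf) {
  keep <- rank_features(x, cl, train)[seq_len(nf)]
  fit <- lda(data.frame(t(x[keep, train, drop = FALSE])), cl[train])
  predict(fit, newdata = data.frame(t(x[keep, test, drop = FALSE])))$x
}

scores_II <- one_pass(x, cl, fold == "I", fold == "II", nf = 2)  # train I, score II
scores_I  <- one_pass(x, cl, fold == "II", fold == "I", nf = 3)  # train II, score I
```

     Because features are selected inside each pass, using training
     observations only, the test-set scores are not influenced by the
     selection step.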

_U_s_a_g_e:

     plotTrainTest(x = Golub.BM, nfeatures = c(11, 11), cl = cancer.BM,
                   traintest = divideUp(cancer.BM),
                   titles = c("A: I/II (train with I, scores are for II)",
                              "B: II/I (train with II, scores are for I)"))

_A_r_g_u_m_e_n_t_s:

       x: Matrix; rows are features, and columns are observations 
          ('samples')

nfeatures: Integer vector giving the number of features to use in each
          of the two passes. A single value is used for both passes.

      cl: Factor that assigns each column of 'x' (each observation) to a
          group, for use in the discriminant calculations

traintest: Values that divide the observations into two sets. In the
          first pass (fold), one set is used for training and the other
          for testing; the roles are reversed in the second pass. The
          set labelled by the first level of 'factor(traintest)' is the
          training set in the first pass.

  titles: A character vector of length 2 giving titles for the two
          graphs
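
     How 'traintest' is interpreted can be illustrated in base R. The
     sketch below mirrors the masking step shown in the function
     definition in the Examples; the grouping vector itself is a
     made-up example, not output from 'divideUp'.

```r
# Hypothetical grouping vector; any two distinct values would do.
traintest <- c("second", "first", "first", "second", "first", "second")
traintest <- factor(traintest)
levels(traintest)                               # "first" "second" (sorted)

# The first level marks set I, the second level marks set II.
train   <- traintest == levels(traintest)[1]
testset <- traintest == levels(traintest)[2]
which(train)                                    # 2 3 5
which(testset)                                  # 1 4 6
```

     Factor levels are sorted, so the order of the levels, not the order
     in which values first appear, decides which set is treated as I.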

_D_e_t_a_i_l_s:

_V_a_l_u_e:

     The function is called for its side effect: two graphs are plotted
     side by side, one for each pass.

_N_o_t_e:

_A_u_t_h_o_r(_s):

     John Maindonald

_R_e_f_e_r_e_n_c_e_s:

_S_e_e _A_l_s_o:

_E_x_a_m_p_l_e_s:

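     library(hddplot)
     ## 50 features (rows) by 20 observations (columns) of random noise
     mat <- matrix(rnorm(1000), ncol=20)
     ## Three classes, of sizes 7, 9 and 4
     cl <- factor(rep(1:3, c(7,9,4)))
     ## Divide the observations into two sets
     gp.id <- divideUp(cl, nset=2)
     plotTrainTest(x=mat, cl=cl, traintest=gp.id, nfeatures=c(2,3))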


     ## The function is currently defined as
     function(x=Golub.BM, nfeatures=c(11,11), cl=cancer.BM,
                traintest=divideUp(cancer.BM),
                titles=c("A: I/II (train with I, scores are for II)",
                  "B: II/I (train with II, scores are for I)")){
         ## Two square plots side by side; restore settings on exit
         oldpar <- par(mfrow=c(1,2), pty="s")
         on.exit(par(oldpar))
         ## A single value of nfeatures is used for both passes
         if(length(nfeatures)==1)nfeatures <- rep(nfeatures,2)
         ## Logical masks for sets I and II
         traintest <- factor(traintest)
         train <- traintest==levels(traintest)[1]
         testset <- traintest==levels(traintest)[2]
         cl1 <- cl[train]
         cl2 <- cl[testset]
         ## Pass 1: select features and fit on I, score II
         nf1 <- nfeatures[1]
         ord1 <- orderFeatures(x, cl, subset=train)
         df1 <- data.frame(t(x[ord1[1:nf1], train]))
         df2 <- data.frame(t(x[ord1[1:nf1], testset]))
         df1.lda <- lda(df1, cl1)
         scores <- predict(df1.lda, newdata=df2)$x
         scoreplot(scorelist=list(scores=scores, cl=cl2,
                  nfeatures=nfeatures[1], other=NULL, cl.other=NULL),
                prefix.title="")
         mtext(side=3, line=2, titles[1], adj=0)
         ## Pass 2: select features and fit on II, score I
         nf2 <- nfeatures[2]
         ord2 <- orderFeatures(x, cl, subset=testset)
         df2 <- data.frame(t(x[ord2[1:nf2], testset]))
         df1 <- data.frame(t(x[ord2[1:nf2], train]))
         df2.lda <- lda(df2, cl2)
         scores <- predict(df2.lda, newdata=df1)$x
         scoreplot(scorelist=list(scores=scores, cl=cl1,
                  nfeatures=nfeatures[2], other=NULL, cl.other=NULL),
                prefix.title="")
         mtext(side=3, line=2, titles[2], adj=0)
       }

