smda                package:sparseLDA                R Documentation

_S_p_a_r_s_e _m_i_x_t_u_r_e _d_i_s_c_r_i_m_i_n_a_n_t _a_n_a_l_y_s_i_s

_D_e_s_c_r_i_p_t_i_o_n:

     Performs sparse linear discriminant analysis for mixture of
     Gaussians models.

_U_s_a_g_e:

     smda(x, ...)

     ## Default S3 method:
     smda(x, y, Z = NULL, Rj = NULL, 
          lambda = 1e-6, stop, maxIte = 50, 
          trace = FALSE, tol = 1e-4, ...)

_A_r_g_u_m_e_n_t_s:

       x: A matrix of the training data with observations down the rows
          and variables in the columns.

       y: A matrix initializing the dummy variables representing the
          groups.

       Z: An optional matrix initializing the probabilities
          representing the groups.

      Rj: A vector of length K containing the number of subclasses in
          each of the K classes.

  lambda: The weight on the L2-norm for elastic net regression.
          Default: 1e-6.

    stop: If stop is negative, its absolute value is the desired
          number of variables. If stop is positive, it is an upper
          bound on the L1-norm of the b coefficients. There is a
          one-to-one correspondence between stop and t.

  maxIte: Maximum number of iterations. Default: 50.

   trace: If TRUE, prints out its progress. Default: FALSE.

     tol: Tolerance for the stopping criterion (change in RSS).
          Default: 1e-4.

     ...: Additional arguments.
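
     As a rough illustration of the shapes described above for y, Z
     and Rj, the following base R sketch builds them from class and
     subclass labels. The labels (cls, sub) are hypothetical, and hard
     0/1 subclass assignments are only one possible way to initialize
     Z.

```r
## Hypothetical labels: 3 classes, 2 subclasses per class,
## 2 observations per subclass (12 observations in total)
cls <- factor(rep(c("a", "b", "c"), each = 4))
sub <- factor(rep(1:6, each = 2))   # subclass ids, nested within classes

## y: n x K indicator (dummy) matrix, one column per class
y <- model.matrix(~ cls - 1)
colnames(y) <- levels(cls)

## Z: n x sum(Rj) matrix of initial subclass probabilities
## (hard assignments here, so every row sums to 1)
Z <- model.matrix(~ sub - 1)
colnames(Z) <- paste0("sub", levels(sub))

## Rj: number of distinct subclasses within each class, in class order
Rj <- as.vector(tapply(sub, cls, function(s) length(unique(s))))

dim(y)
dim(Z)
Rj
```

     The columns of Z are assumed to be ordered by class, so that the
     first Rj[1] columns belong to the first class, and so on.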

_D_e_t_a_i_l_s:

     The function finds sparse directions for linear classification of
     mixture of Gaussians models.

_V_a_l_u_e:

     Returns a list with the following components:

    call: The call

    beta: The loadings of the sparse discriminative directions.

   theta: The optimal scores.

       Z: Updated subclass probabilities.

      Rj: A vector of the number of subclasses per class.

     rss: A vector of the Residual Sum of Squares at each iteration.
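
     The components above can be inspected with base R. The sketch
     below uses a mock list (fit) with made-up numbers rather than a
     real fit, and assumes that beta has one row per variable and one
     column per discriminative direction.

```r
## Mock object using the documented component names; the values are
## invented for illustration and do not come from a real fit
fit <- list(
  beta = matrix(c(0.8, 0,   0, -0.3,
                  0,   0.5, 0,  0), nrow = 4),  # 4 variables x 2 directions
  rss  = c(10.2, 6.1, 5.9, 5.89)
)

## Number of non-zero loadings in each discriminative direction
nonzero <- colSums(fit$beta != 0)

## Indices of the variables selected in at least one direction
selected <- which(rowSums(fit$beta != 0) > 0)

## Absolute change in RSS between successive iterations
delta <- abs(diff(fit$rss))

nonzero
selected
delta
```

     Comparing delta with tol gives a rough idea of how close the
     iterations were to the stopping criterion; the exact form of the
     criterion used internally is not specified here.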

_A_u_t_h_o_r(_s):

     Line Clemmensen

_R_e_f_e_r_e_n_c_e_s:

     Clemmensen, L., Hastie, T. and Ersboell, K. (2007) "Sparse
     discriminant analysis", Technical report, IMM, Technical
     University of Denmark.

_S_e_e _A_l_s_o:

     'normalize', 'normalizetest', 'sda'

_E_x_a_m_p_l_e_s:

     ## load data
     data(penicilliumYES)
     X <- penicilliumYES$X
     Y <- penicilliumYES$Y
     Z <- penicilliumYES$Z

     ## test samples
     Iout <- c(3, 6, 9, 12)
     Iout <- c(Iout, Iout+12, Iout+24)

     ## training data
     Xtr <- X[-Iout,]
     k <- 3
     n <- dim(Xtr)[1]
     Rj <- rep(4, 3)

     ## Normalize data
     Xc <- normalize(Xtr)
     Xn <- Xc$Xc
     p <- dim(Xn)[2]

     ## perform SMDA with five non-zero loadings for each discriminative
     ## direction (stop = -5)
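     ## Y and Z are restricted to the training rows so that they match Xn
     ## (assumes Y and Z have one row per observation in X)
     smdaFit <- smda(x = Xn,
                     y = Y[-Iout, ],
                     Z = Z[-Iout, ],
                     Rj = Rj,
                     lambda = 1e-6,
                     stop = -5,
                     maxIte = 10,
                     trace = TRUE,
                     tol = 1e-2)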

     ## testing
     Xtst <- X[Iout,]
     Xtst <- normalizetest(Xtst, Xc)

     test <- predict(smdaFit, Xtst)
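
     The example normalizes the test samples with statistics taken
     from the training data. The base R sketch below illustrates that
     general idea on toy matrices. It is not the implementation of
     normalize or normalizetest, whose exact scaling may differ, and
     all names ending in _toy or _n are hypothetical.

```r
## Toy matrices standing in for training and test data
set.seed(1)
Xtr_toy  <- matrix(rnorm(20), nrow = 5)   # 5 training observations
Xtst_toy <- matrix(rnorm(8),  nrow = 2)   # 2 test observations

## Column means and standard deviations come from the training data only
mu   <- colMeans(Xtr_toy)
sdev <- apply(Xtr_toy, 2, sd)

## Apply the same centering and scaling to both sets
Xtr_n  <- sweep(sweep(Xtr_toy,  2, mu, "-"), 2, sdev, "/")
Xtst_n <- sweep(sweep(Xtst_toy, 2, mu, "-"), 2, sdev, "/")

round(colMeans(Xtr_n), 10)
dim(Xtst_n)
```

     Reusing the training statistics keeps the test samples on the
     same scale as the data the model was fitted to, without letting
     information from the test set enter the fit.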

