plus                  package:plus                  R Documentation

_F_i_t_s _l_i_n_e_a_r _r_e_g_r_e_s_s_i_o_n _w_i_t_h _a _q_u_a_d_r_a_t_i_c _s_p_l_i_n_e _p_e_n_a_l_t_y, _i_n_c_l_u_d_i_n_g _t_h_e _L_a_s_s_o, _M_C+ _a_n_d _S_C_A_D.

_D_e_s_c_r_i_p_t_i_o_n:

     The algorithm generates a piecewise linear path of coefficients
     and penalty levels as critical points of a penalized loss in
     linear regression, starting with zero coefficients for infinity
     penalty and ending with a least squares fit for zero penalty. It
     is an extension of the LARS algorithm from the absolute value
     penalty to quadratic spline penalties.

_U_s_a_g_e:

     plus(x,y, method = c("lasso", "mc+", "scad", "general"), m=2, gamma,v,t,
        monitor=FALSE, normalize = TRUE, intercept = TRUE,
        Gram, use.Gram = FALSE, eps=1e-15, max.steps=500, lam)

_A_r_g_u_m_e_n_t_s:

       x: predictors, an n by p matrix with n > 1 and p > 1.  

       y: response, an n-vector with n > 1. 

  method: one of "lasso", "mc+", "scad" or "general". The Lasso penalty
          is specified by m = 1, MC+ by m = 2 and gamma > 0, and SCAD
          by m = 3 and gamma > 1. A general quadratic spline penalty
          is specified by m-vectors v and t.

       m: number of knots of the quadratic spline penalty: m = 1 for
          Lasso, m = 2 for MC+, m = 3 for SCAD. Default is m = 2.

   gamma: the largest knot of a quadratic spline penalty, say rho(.);
          gamma = 0 for lasso.  

       v: m-vector giving the negative second derivative of the
          penalty rho(.) between two consecutive knots or beyond gamma.

       t: m-vector giving the discontinuities of the derivatives of the
          penalty function rho(.) as knots, including 0 as a knot.  

 monitor: If TRUE, plus prints out its progress when variables move in
          and out of the active set. Default is FALSE.  

normalize: If TRUE, each variable is standardized to have unit mean
          squares, otherwise it is left alone. Default is TRUE.  

intercept: If TRUE, an intercept is included in the model (and not
          penalized), otherwise no intercept is included. Default is
          TRUE.  

    Gram: The X'X matrix; useful for repeated runs (e.g. bootstrap)
          where a large X'X stays the same.  

use.Gram: When p is very large, you may not want PLUS to precompute the
          entire Gram matrix. Default is FALSE.  

     eps: An effective zero. Default is 1e-15.

max.steps: Limit the number of steps taken. Default is 500. There can
          be many more steps than n or p since variables can be removed
          and added as the algorithm proceeds. Users should check
          whether the desired penalty level has been reached if PLUS
          stops at the maximum number of steps.

     lam: A decreasing sequence of nonnegative numbers as penalty
          levels for which penalized estimates of coefficients are
          generated. Default is the vector of ordered penalty levels at
          the turning points of the computed path. If lam is set, the
          computation stops when the path first hits the minimum of
          lam. The scale of lam is determined by the penalized loss
          sum((y - x  

_D_e_t_a_i_l_s:

     PLUS is described in detail in Zhang (2007). It computes a
     complete path of crititcal points of a  penalised squared loss
     emcompassing from zero for infinite penalty to a lease squares fit
     for zero  penalty, including possible multiple local minima for
     each penalty level.
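
     The way m, gamma, v and t describe a quadratic spline penalty can
     be illustrated in base R. The sketch below is an assumption-laden
     illustration, not code from the package: it assumes the penalty
     derivative is normalized so that rho'(0+) = 1 and decreases with
     slope -v[k] between knots t[k] and t[k+1] (and beyond the last
     knot). The helper name spline_penalty_deriv and the example values
     of v and t for MC+ and SCAD are hypothetical choices that follow
     from that assumption.

```r
# Derivative rho'(b) of a quadratic spline penalty.
# ASSUMPTION: rho'(0+) = 1 and the slope is -v[k] on the k-th segment,
# where segment k runs from knot t[k] to t[k+1] (last segment to Inf).
spline_penalty_deriv <- function(b, v, t) {
  b <- abs(b)
  upper <- c(t[-1], Inf)                   # right end of each segment
  sapply(b, function(bi) {
    len <- pmax(0, pmin(bi, upper) - t)    # part of [0, bi] in each segment
    1 - sum(v * len)
  })
}

gamma <- 3

# Lasso (m = 1): a single knot at 0, no curvature
v_lasso <- 0;                    t_lasso <- 0
# MC+ (m = 2): knots 0 and gamma, derivative falls linearly to 0 at gamma
v_mcp  <- c(1 / gamma, 0);       t_mcp  <- c(0, gamma)
# SCAD (m = 3): knots 0, 1 and gamma, flat up to 1, then linear to 0
v_scad <- c(0, 1 / (gamma - 1), 0); t_scad <- c(0, 1, gamma)

spline_penalty_deriv(c(0.5, 1.5, 4), v_mcp, t_mcp)    # approx. 0.833, 0.5, 0
spline_penalty_deriv(c(0.5, 2, 5), v_scad, t_scad)    # 1, 0.5, 0
spline_penalty_deriv(2, v_lasso, t_lasso)             # 1
```

     Under this normalization the derivative reaches zero at gamma for
     both MC+ and SCAD, which is why large coefficients are left
     unpenalized; whether plus() expects v and t on exactly this scale
     for method = "general" should be checked against Zhang (2007).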

_V_a_l_u_e:

     A "plus" object is returned, for which print, predict, coef and
     plot methods exist. In addition  to arguments x, y, max.steps, and
     the used values of method, gamma and lam, the object  contains the
     following items: 

     Some significant components of the object are: 

       v: matrix with rows as p-vectors indicating the parallelepipeds
          in which the computed path lives

beta.path: matrix with rows as p-vectors of regression coefficients at
          the turning points of the solution path

lam.path: penalty levels at the turning points of the computed path.
          When the penalty function is concave, lam.path may not be a
          decreasing sequence but always takes nonnegative values.

    beta: matrix with rows as p-vectors of coefficients when the
          solution path first hits lam

     lam: the specified penalty levels hit by lam.path. This may not be
          the same as argument lam if the minimum of the argument is
          not reached by the computed solution path.

     dim: the number of nonzero coefficients in each row of beta

r.square: R-square values for beta

total.hits: length of output lam

total.steps: total number of steps executed, the same as the total
          number of segments in the computed solution path. With zero
          as the first coefficient vector, beta.path contains one more
          vector than total.steps.

full.path: TRUE if zero penalty is reached.

forced.stop: TRUE if PLUS is forced to stop due to reasons other than
          reaching max.steps or the minimum of argument lam.

singular.Q: TRUE if PLUS is forced to stop when a matrix is not
          invertible.
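
     The components above can be combined to inspect a fitted path and
     to see why the algorithm stopped. The following is a minimal
     sketch: path_summary and path_status are hypothetical helpers (not
     part of the package), they rely only on the component names listed
     in this section, and they assume that lam.path has one entry per
     row of beta.path. The mock list stands in for a real "plus" object
     so that the sketch runs without the package.

```r
# Tabulate the turning points of a path (hypothetical helper).
# ASSUMPTION: lam.path has one entry per row of beta.path.
path_summary <- function(obj) {
  stopifnot(length(obj$lam.path) == nrow(obj$beta.path))
  data.frame(
    step    = seq_along(obj$lam.path) - 1,     # step 0 is the zero vector
    lam     = obj$lam.path,
    nonzero = rowSums(obj$beta.path != 0)      # active coefficients per row
  )
}

# Report why the path ended, using the documented logical flags
# (hypothetical helper).
path_status <- function(obj) {
  if (isTRUE(obj$full.path)) {
    "full path: zero penalty reached"
  } else if (isTRUE(obj$singular.Q)) {
    "stopped: a matrix was not invertible"
  } else if (isTRUE(obj$forced.stop)) {
    "stopped early for another reason"
  } else if (obj$total.steps >= obj$max.steps) {
    "stopped at max.steps: compare min(lam.path) with the desired level"
  } else {
    "stopped at the minimum of argument lam"
  }
}

# Mock object with made-up values, for illustration only
mock <- list(
  lam.path    = c(2, 1, 0.4),
  beta.path   = rbind(c(0, 0), c(1, 0), c(1.5, 0.3)),
  total.steps = 2, max.steps = 2,
  full.path = FALSE, forced.stop = FALSE, singular.Q = FALSE
)

path_summary(mock)
path_status(mock)
```

     With a real fit, the same helpers would be called on the object
     returned by plus().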

_A_u_t_h_o_r(_s):

     Cun-Hui Zhang and Ofer Melnik

_R_e_f_e_r_e_n_c_e_s:

     Zhang, C.-H. (2007). Penalized linear unbiased selection.
     Technical Report No. 2007-003.  Department of Statistics, Rutgers
     University.

_S_e_e _A_l_s_o:

     print, plot, and predict methods

_E_x_a_m_p_l_e_s:

     data(sp500)
     attach(sp500)
     x <- sp500.percent[,3: (dim(sp500.percent)[2])] 
     y <- sp500.percent[,1]

     par(mfrow=c(2,3))
     object <- plus(x,y,method="lasso")
     plot(object)
     plot(object, yvar="dim")
     plot(object, yvar="R-sq")
     object <- plus(x,y,method="mc+")
     plot(object)
     plot(object, yvar="dim")
     plot(object, yvar="R-sq")
     detach(sp500)
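
     The lam argument and the corresponding output components can be
     tried on simulated data. This is a minimal sketch rather than an
     official example: the simulated design, the seed and the three
     penalty levels are arbitrary assumptions, and the code uses only
     the plus() arguments and object components described on this page.

```r
library(plus)  # assumes the plus package is installed

# Simulated data (illustrative values only)
set.seed(1)
n <- 60
p <- 8
x <- matrix(rnorm(n * p), n, p)
y <- as.vector(2 * x[, 1] - 1.5 * x[, 2] + rnorm(n))

# Request estimates at a decreasing sequence of penalty levels
fit <- plus(x, y, method = "mc+", lam = c(1, 0.5, 0.1))

fit$lam        # penalty levels actually hit by the computed path
fit$beta       # coefficients when the path first hits each level
fit$dim        # number of nonzero coefficients
fit$r.square   # R-square values for beta
fit$full.path  # TRUE only if zero penalty was reached
```

     Because the computation stops when the path first hits the minimum
     of lam, full.path is not expected to be TRUE here unless the
     smallest requested level is zero.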

