mars                   package:mda                   R Documentation

_M_u_l_t_i_v_a_r_i_a_t_e _A_d_a_p_t_i_v_e _R_e_g_r_e_s_s_i_o_n _S_p_l_i_n_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Multivariate adaptive regression splines.

_U_s_a_g_e:

     mars(x, y, w, wp, degree, nk, penalty, thresh, prune, trace.mars,
          forward.step, prevfit, ...)

_A_r_g_u_m_e_n_t_s:

       x: a matrix containing the independent variables.

       y: a vector containing the response variable, or in the case of
          multiple responses, a matrix whose columns are the response
          values for each variable.

       w: an optional vector of observation weights.

      wp: an optional vector of response weights.

  degree: an optional integer specifying maximum interaction degree
          (default is 1).

      nk: an optional integer specifying the maximum number of model
          terms.

 penalty: an optional value specifying the cost per degree of freedom
          charge (default is 2).

  thresh: an optional value specifying forward stepwise stopping
          threshold (default is 0.001).

   prune: an optional logical value specifying whether the model should
          be pruned in a backward stepwise fashion (default is 'TRUE').

trace.mars: an optional logical value specifying whether info should be
          printed along the way (default is 'FALSE').

forward.step: an optional logical value specifying whether forward
          stepwise process should be carried out (default is 'TRUE').

 prevfit: an optional data structure from a previous fit.  To see the
          effect of changing the penalty parameter, one can use
          'prevfit' with 'forward.step = FALSE'.

     ...: further arguments to be passed to or from methods.

_V_a_l_u_e:

     An object of class '"mars"', which is a list with the following
     components:

    call: the call used to invoke 'mars'.

all.terms: term numbers in full model.  '1' is the constant term.
          Remaining terms are in pairs ('2 3', '4 5', and so on).
          'all.terms' indicates the nonsingular set of terms.

selected.terms: term numbers in selected model.

 penalty: the input penalty value.

  degree: the input degree value.

  thresh: the input threshold value.

     gcv: gcv of chosen model.

  factor: matrix with ij-th element equal to 1 if term i has a factor
          of the form x_j > c, equal to -1 if term i has a factor of
          the form x_j <= c, and to 0 if x_j is not in term i.

    cuts: matrix with ij-th element equal to the cut point c for
          variable j in term i.

residuals: residuals from fit.

  fitted: fitted values from fit.

    lenb: length of full model.

coefficients: least squares coefficients for final model.

       x: a matrix of basis functions obtained from the input x matrix.

_N_o_t_e:

     This function was coded from scratch, and did not use any of
     Friedman's mars code.  It gives quite similar results to
     Friedman's program in our tests, but not exactly the same
     results.  We have not implemented Friedman's anova decomposition,
     nor are categorical predictors handled properly yet.  Our version
     does handle multiple response variables, however.  As it is not
     well-tested, we would like to hear of any bugs.

_A_u_t_h_o_r(_s):

     Trevor Hastie and Robert Tibshirani

_R_e_f_e_r_e_n_c_e_s:

     J. Friedman, ``Multivariate Adaptive Regression Splines'' (with
     discussion) (1991).  _Annals of Statistics_, *19*/1, 1-141.

_S_e_e _A_l_s_o:

     'predict.mars', 'model.matrix.mars'

_E_x_a_m_p_l_e_s:

     data(trees)
     fit1 <- mars(trees[,-3], trees[3])
     showcuts <- function(obj)
     {
       tmp <- obj$cuts[obj$selected.terms, ]
       dimnames(tmp) <- list(NULL, names(trees)[-3])
       tmp
     }
     showcuts(fit1)
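
     ## A hedged sketch, not part of the original example: rebuild the
     ## selected basis functions by hand from 'factor' and 'cuts'.
     ## It assumes each factor of a term is the hinge max(0, s * (x_j - c)),
     ## with s = +1 or -1 taken from 'factor' and c taken from 'cuts'.
     ## 'hinge_basis', 'x0' and 'fit0' are illustrative names, not part of mda.
     library(mda)
     x0 <- as.matrix(trees[, -3])
     fit0 <- mars(x0, trees[, 3])
     hinge_basis <- function(obj, x) {
       terms <- obj$selected.terms
       ## start from all ones; a term with no factors stays constant
       bx <- matrix(1, nrow(x), length(terms))
       for (i in seq_along(terms)) {
         for (j in seq_len(ncol(x))) {
           s <- obj$factor[terms[i], j]
           if (s != 0)
             bx[, i] <- bx[, i] * pmax(0, s * (x[, j] - obj$cuts[terms[i], j]))
         }
       }
       bx
     }
     bx0 <- hinge_basis(fit0, x0)
     ## compare with the basis matrix stored in the fit
     all.equal(bx0, fit0$x, check.attributes = FALSE)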

     ## examine the fitted functions
     par(mfrow=c(1,2), pty="s")
     Xp <- matrix(sapply(trees[1:2], mean), nrow(trees), 2, byrow=TRUE)
     for(i in 1:2) {
       xr <- sapply(trees, range)
       Xp1 <- Xp; Xp1[,i] <- seq(xr[1,i], xr[2,i], length.out=nrow(trees))
       Xf <- predict(fit1, Xp1)
       plot(Xp1[ ,i], Xf, xlab=names(trees)[i], ylab="", type="l")
     }
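
     ## A hedged sketch of the 'prevfit' usage described under Arguments:
     ## reuse the terms found by an earlier fit and, with
     ## 'forward.step = FALSE', skip the forward pass so that only the
     ## pruning is redone under a different penalty.  The object names and
     ## the penalty value 4 are illustrative assumptions.
     library(mda)
     xa <- as.matrix(trees[, -3])
     ya <- trees[, 3]
     fit.a <- mars(xa, ya, penalty = 2)
     fit.b <- mars(xa, ya, penalty = 4, prevfit = fit.a,
                   forward.step = FALSE)
     ## compare the selected terms and GCV of the two fits
     fit.a$selected.terms; fit.a$gcv
     fit.b$selected.terms; fit.b$gcv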

