ltsReg                 package:rrcov                 R Documentation

_R_o_b_u_s_t _r_e_g_r_e_s_s_i_o_n _w_i_t_h _h_i_g_h _b_r_e_a_k_d_o_w_n _p_o_i_n_t

_D_e_s_c_r_i_p_t_i_o_n:

     Carries out least trimmed squares (LTS) regression.

_U_s_a_g_e:

     ltsReg(x, ...)

     ## S3 method for class 'formula':
     ltsReg(formula, data, ..., 
             model = TRUE, x.ret = FALSE, y.ret = FALSE)

     ## Default S3 method:
     ltsReg(x, y, intercept=TRUE, alpha=NULL, nsamp=500, 
             adjust=FALSE, mcd=TRUE, qr.out=FALSE, yname=NULL, seed=0, 
             use.correction=TRUE, control, ...)

_A_r_g_u_m_e_n_t_s:

 formula: a formula of the form 'y ~ x1 + x2 + ...'. 

    data: data frame from which variables specified in 'formula' are to
          be taken. 

model, x.ret, y.ret: logical. If 'TRUE' the model frame, the model
          matrix and the response are returned, respectively. 

       x: a matrix or data frame containing the explanatory variables. 

       y: the response: a vector of length the number of rows of 'x'. 

intercept: if true, a model with constant term will be estimated;
          otherwise no constant term will be included. Default is
          'intercept = TRUE'  

   alpha: the fraction of squared residuals whose sum will be
          minimized. Its default value is 0.5.  In general, alpha must
          be a value between 0.5 and 1. 

   nsamp: number of subsets used for initial estimates, or '"best"' or
          '"exact"'.  Default is 'nsamp = 500'. If 'nsamp="best"',
          exhaustive enumeration is done, as long as the number of
          trials does not exceed 5000. If 'nsamp="exact"', exhaustive
          enumeration will be attempted however many samples are
          needed.  In this case a warning message will be displayed
          saying that the computation can take a very long time. 

  adjust: whether to perform intercept adjustment at each step.  This
          can be quite time consuming, so the default is
          'adjust = FALSE'.

     mcd: whether to compute robust distances using Fast-MCD. Default
          is 'mcd = TRUE'.

  qr.out: whether to return the QR decomposition. Default is 'qr.out =
          FALSE'

   yname: the name of the dependent variable. Default is 'yname = NULL'

    seed: starting value for the random number generator. Default is
          'seed = 0'.

use.correction: whether to use finite sample correction factors.
          Default is 'use.correction=TRUE'

 control: a list with estimation options, the same as those provided in
          the function specification. If the control object is
          supplied, the parameters from it will be used. If parameters
          are also passed in the invocation statement, they will
          override the corresponding elements of the control object. 

     ...: arguments passed to or from other methods. 
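
     The precedence rule described for 'control' can be illustrated
     with a minimal base R sketch. The helper 'resolve_opts' below is
     hypothetical and is not part of the package; it only mimics the
     documented behaviour for two of the options, and the actual
     implementation may differ.

```r
# Hypothetical helper (not part of the package): explicitly passed
# arguments override the corresponding elements of the control list.
resolve_opts <- function(alpha = 0.5, nsamp = 500, control = list()) {
  opts <- list(alpha = alpha, nsamp = nsamp)
  # Take a value from 'control' only when the caller did not pass it.
  if (missing(alpha) && !is.null(control$alpha)) opts$alpha <- control$alpha
  if (missing(nsamp) && !is.null(control$nsamp)) opts$nsamp <- control$nsamp
  opts
}

ctrl <- list(alpha = 0.75, nsamp = 1000)
o1 <- resolve_opts(control = ctrl)               # all values from 'ctrl'
o2 <- resolve_opts(alpha = 0.9, control = ctrl)  # 'alpha' overridden
```

     Here 'o1' takes both values from the control list, while 'o2'
     keeps 'nsamp' from the control list and uses the explicitly
     passed 'alpha'.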

_D_e_t_a_i_l_s:

     The LTS regression method minimizes the sum of the h smallest
     squared residuals, where h must be at least half the number of
     observations. The default value of h is roughly 0.5n, where n is
     the total number of observations, but the user may choose any
     value between n/2 and n.  The LTS estimate of the error scale is
     given by the minimum of the objective function multiplied by a
     consistency factor and a finite sample correction factor; see
     Pison et al. (2002) for details. The rescaling factors for the raw
     and final estimates are also returned, in the vectors 'raw.cnp2'
     and 'cnp2' respectively, each of length 2. The finite sample
     corrections can be suppressed by setting 'use.correction=FALSE'.
     The computations are performed using the Fast LTS algorithm
     proposed by Rousseeuw and Van Driessen (1999).
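
     As an illustration of the objective function only, the following
     base R sketch evaluates the sum of the h smallest squared
     residuals and runs a naive random search over two-point subsets,
     each refined by a single refit on the h observations with the
     smallest residuals. This is not the Fast LTS algorithm; the
     function name 'lts_objective', the simulated data and the choice
     of h are assumptions made for the example.

```r
# Sum of the h smallest squared residuals for coefficients 'beta'.
lts_objective <- function(beta, X, y, h) {
  r2 <- as.vector(y - X %*% beta)^2
  sum(sort(r2)[seq_len(h)])
}

set.seed(1)
n <- 40
x <- seq(1, 10, length.out = n)
y <- 2 + 3 * x + rnorm(n, sd = 0.5)
y[1:5] <- y[1:5] + 50            # contaminate five observations
X <- cbind(1, x)                 # design matrix with intercept column
h <- floor(n / 2) + 1            # illustrative choice, roughly n/2

ols <- qr.solve(X, y)            # ordinary least squares fit
best <- ols
best_crit <- lts_objective(ols, X, y, h)

# Naive search (an assumption for illustration, not Fast LTS).
for (i in 1:200) {
  idx <- sample(n, 2)                        # two distinct points
  b <- solve(X[idx, ], y[idx])               # exact fit through them
  keep <- order(as.vector(y - X %*% b)^2)[1:h]
  b <- qr.solve(X[keep, ], y[keep])          # refit on the h best points
  crit <- lts_objective(b, X, y, h)
  if (crit < best_crit) {
    best <- b
    best_crit <- crit
  }
}
```

     After the loop, 'best_crit' is never larger than the objective
     evaluated at the least squares coefficients 'ols', because the
     search starts from that fit and only accepts improvements.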

     The formula interface has an implied intercept term.  This can be
     removed by using either 'y ~ x - 1' or 'y ~ 0 + x'.  See 'formula'
     for more details.
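
     The effect of these formula variants on the design matrix can be
     checked with base R alone. The small data frame 'd' is made up
     for the example.

```r
d <- data.frame(y = c(1, 2, 3), x = c(4, 5, 6))

# Implied intercept: the design matrix has a constant column.
with_int <- colnames(model.matrix(y ~ x, d))      # "(Intercept)" "x"

# Both forms below remove the intercept column.
no_int_a <- colnames(model.matrix(y ~ x - 1, d))  # "x"
no_int_b <- colnames(model.matrix(y ~ 0 + x, d))  # "x"
```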

_V_a_l_u_e:

     The function 'ltsReg' returns an object of class '"lts"'.  The
     function 'summary' is used to obtain and print a summary table of
     the results.   The generic accessor functions 'coefficients',
     'fitted.values' and 'residuals' extract various useful features of
     the value returned by 'ltsReg'.

     An object of class 'lts' is a list containing at least the
     following components:

    crit: the value of the objective function of the LTS regression
          method, i.e. the sum of the h smallest squared raw residuals. 

coefficients: vector of coefficient estimates (including the
          intercept, when 'intercept=TRUE'), obtained after reweighting. 

    best: the best subset found and used for computing the raw
          estimates. The size of 'best' is equal to 'quan'. 

fitted.values: vector like y containing the fitted values of the
          response after reweighting. 

residuals: vector like y containing the residuals from the weighted
          least squares regression. 

   scale: scale estimate of the reweighted residuals.  

   alpha: same as the input parameter 'alpha'. 

    quan: the number h of observations that have determined the least
          trimmed squares estimator 

intercept: same as the input parameter 'intercept'.  

    cnp2: a vector of length two containing the consistency correction
          factor and the  finite sample correction factor of the final
          estimate of the error scale. 

raw.coefficients: vector of raw coefficient estimates (including the
          intercept, when 'intercept=TRUE'). 

raw.scale: scale estimate of the raw residuals. 

raw.resid: vector like y containing the raw residuals from the
          regression. 

raw.cnp2: a vector of length two containing the consistency correction
          factor and the  finite sample correction factor of the raw
          estimate of the error scale. 

  lts.wt: vector like y containing weights that can be used in a
          weighted least squares. These weights are 1 for points with
          reasonably small raw residuals, and 0 for points with large
          raw residuals. 

  method: character string naming the method (Least Trimmed Squares). 

       X: the input data as a matrix. 

       Y: the response variable as a vector. 
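
     The relation between the raw residuals, the 0/1 weights in
     'lts.wt' and the reweighted coefficients can be sketched in base
     R as follows. The raw coefficients, the use of 'mad' as the raw
     scale and the cutoff 'qnorm(0.9875)' are assumptions chosen for
     the illustration; the values used internally by 'ltsReg' may
     differ.

```r
set.seed(2)
n <- 30
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n, sd = 0.3)
y[1:3] <- y[1:3] + 20                  # three gross outliers

raw_coef  <- c(1, 2)                   # stand-in for the raw estimates
raw_resid <- y - (raw_coef[1] + raw_coef[2] * x)
raw_scale <- mad(raw_resid)            # stand-in for the raw scale
cutoff    <- qnorm(0.9875)             # assumed cutoff, about 2.24

# Weight 1 for reasonably small raw residuals, 0 for large ones.
wt <- as.numeric(abs(raw_resid / raw_scale) <= cutoff)

# Weighted least squares with these weights gives reweighted estimates.
fit <- lm(y ~ x, weights = wt)
```

     In this sketch the three contaminated observations receive
     weight 0 and therefore do not influence 'coef(fit)'.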

_R_e_f_e_r_e_n_c_e_s:

     P. J. Rousseeuw (1984), Least Median of Squares Regression.
     _Journal of the American Statistical Association_, *79*, pp.
     871-881. 

     P. J. Rousseeuw and A. M. Leroy (1987)  _Robust Regression and
     Outlier Detection._ Wiley. 

     P. J. Rousseeuw and K. Van Driessen (1999) Computing LTS
     Regression for Large Data Sets,  Technical Report, University of
     Antwerp, submitted.

     P. J. Rousseeuw and K. Van Driessen (1999)  A fast algorithm for
     the minimum covariance determinant estimator.  _Technometrics_,
     *41*, 212-223.

     Pison, G., Van Aelst, S., and Willems, G. (2002),  Small Sample
     Corrections for LTS and MCD,  _Metrika_, *55*, 111-123.

_S_e_e _A_l_s_o:

     'covMcd' 

     'summary.lts' for summaries.

     The generic functions 'coef', 'residuals', 'fitted'.

_E_x_a_m_p_l_e_s:

     data(heart)
     ltsReg(heart.x, heart.y)

     data(stackloss)
     ltsReg(stack.loss ~ ., data = stackloss)

