fwdglm                package:forward                R Documentation

_F_o_r_w_a_r_d _S_e_a_r_c_h _i_n _G_e_n_e_r_a_l_i_z_e_d _L_i_n_e_a_r _M_o_d_e_l_s

_D_e_s_c_r_i_p_t_i_o_n:

     This function applies the forward search approach to robust
     analysis in generalized linear models.

_U_s_a_g_e:

     fwdglm(formula, family, data, weights, na.action, contrasts = NULL, bsb = NULL, 
            balanced = TRUE, maxit = 50, epsilon = 1e-06, nsamp = 100, trace = TRUE)

_A_r_g_u_m_e_n_t_s:

 formula: a symbolic description of the model to be fit. The details of
          the model are the same as for glm.

  family: a description of the error distribution and link function to
          be used in the model. See `family' for details.

    data: an optional data frame containing the variables in the model.
          By default the variables are taken from the environment from
          which the function is called.

 weights: an optional vector of weights to be used in the fitting
          process.

na.action: a function which indicates what should happen when the data
          contain `NA's. The default is set by the `na.action' setting
          of `options', and is `na.fail' if that is unset. The
          factory-fresh default is `na.omit'.

contrasts: an optional list. See the `contrasts.arg' of
          `model.matrix.default'.

     bsb: an optional vector specifying a starting subset of
          observations to be used in the forward search. By default the
          ``best'' starting subset is chosen using the function
          'lmsglm' with control arguments provided by `nsamp'.

balanced: logical; for a binary response, if 'TRUE' the subset is kept
          approximately balanced during the forward search, so that
          its proportion of successes stays close to that of the full
          dataset.

   maxit: integer giving the maximal number of IWLS iterations. See
          'glm.control' for details.

 epsilon: positive convergence tolerance epsilon. See 'glm.control' for
          details.

   nsamp: the initial subset for the forward search in generalized
          linear models is found by the function 'lmsglm'. This
          argument controls how many subsets are used in the robust
          fitting procedure. The choices are: the number of samples
          (100 by default) or `"all"'. Note that the algorithm tries
          to find `nsamp' good subsets, examining at most 2*`nsamp'
          subsets.

   trace: logical, if 'TRUE' a message is printed for every ten
          iterations completed during the forward search.
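
     The `maxit' and `epsilon' arguments take the same kind of values
     as the settings of the same name in 'glm.control', and the
     `na.action' default is read from `options'. The following is a
     minimal sketch using only base R (the stats package), not
     'fwdglm' itself. The data frame `d' is made up for illustration,
     and it is an assumption that 'fwdglm' hands these values on to
     the IWLS fit in the way an ordinary 'glm' call would.

```r
# Convergence settings matching the fwdglm defaults shown in Usage.
ctrl <- glm.control(epsilon = 1e-06, maxit = 50)

# The na.action setting that model-fitting functions consult by default.
na_setting <- getOption("na.action")

# An ordinary glm fit using the same control values (made-up data).
d <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12, 15),
                x = 1:9)
fit <- glm(y ~ x, family = poisson(log), data = d, control = ctrl)

# TRUE when IWLS met the epsilon tolerance within maxit iterations.
fit$converged
```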

_V_a_l_u_e:

     The function returns an object of class `"fwdglm"' with the
     following components: 

    call: the matched call.

Residuals: a (n x (n-p+1)) matrix of residuals.

    Unit: a matrix of units added (to a maximum of 5 units) at each
          step.

included: a list with each element containing a vector of units
          included at each step of the forward search.

Coefficients: a ((n-p+1) x p) matrix of coefficients.

tStatistics: a ((n-p+1) x p) matrix of t statistics for the
          coefficients, i.e. coef.est/SE(coef.est).

Leverage: a (n x (n-p+1)) matrix of leverage values.

  MaxRes: a ((n-p) x 2) matrix of max deviance residuals in the best
          subsets and m-th deviance residuals.

MinDelRes: a ((n-p-1) x 2) matrix of minimum deviance residuals out of
          best subsets and (m+1)-th deviance residuals.

ScoreTest: a ((n-p) x 1) matrix of score test statistics for a goodness
          of link test.

Likelihood: a ((n-p) x 4) matrix with columns containing: deviance,
          residual deviance, pseudo R^2 (computed as
          1-deviance/null.deviance), dispersion parameter (computed as
          sum(pearson.residuals^2)/(m - p)).

CookDist: a ((n-p) x 1) matrix of forward Cook's distances.

ModCookDist: a ((n-p) x 5) matrix of forward modified Cook's distances
          for the units (to a maximum of 5 units) included at each
          step.

 Weights: a (n x (n-p)) matrix of weights used at each step of the
          forward search.

  inibsb: a vector giving the best starting subset chosen by 'lmsglm'.

binary.response: logical, equal to 'TRUE' if binary response.
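
     The formulas quoted above for the pseudo R^2, the dispersion
     parameter and the t statistics can be tried on a single ordinary
     'glm' fit. The sketch below uses only base R and made-up data
     (`d' is hypothetical); it shows the quantities for one fit, not
     how 'fwdglm' computes them at each step. In particular, whether
     'fwdglm' scales the standard errors by the estimated dispersion
     is not stated here, so the t statistics below use the standard
     errors from 'vcov'.

```r
# Made-up data; one ordinary Poisson fit stands in for a single search step.
d <- data.frame(y = c(2, 3, 6, 7, 8, 9, 10, 12, 15), x = 1:9)
fit <- glm(y ~ x, family = poisson(log), data = d)

m <- nrow(d)              # number of units used in this fit
p <- length(coef(fit))    # number of coefficients

# Pseudo R^2: 1 - deviance/null.deviance
pseudo_r2 <- 1 - fit$deviance / fit$null.deviance

# Dispersion: sum(pearson.residuals^2)/(m - p)
dispersion <- sum(residuals(fit, type = "pearson")^2) / (m - p)

# t statistics: coef.est/SE(coef.est)
se <- sqrt(diag(vcov(fit)))
t_stat <- coef(fit) / se

# Leverage values (diagonal of the hat matrix) and deviance residuals
lev <- hatvalues(fit)
dev_res <- residuals(fit, type = "deviance")
```

     In a forward search these quantities would be recomputed for
     each subset size m, which is why the components above are
     matrices with one row or column per step.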

_A_u_t_h_o_r(_s):

     Originally written for S-Plus by Kjell Konis
     <kkonis@insightful.com> and Marco Riani <mriani@unipr.it>.
     Ported to R by Luca Scrucca <luca@stat.unipg.it>.

_R_e_f_e_r_e_n_c_e_s:

     Atkinson, A.C. and Riani, M. (2000), _Robust Diagnostic Regression
     Analysis_, First Edition. New York: Springer, Chapter 6.

_S_e_e _A_l_s_o:

     'summary.fwdglm', 'plot.fwdglm', 'fwdlm', 'fwdsco'.

_E_x_a_m_p_l_e_s:

      
     data(cellular)
     cellular$TNF <- as.factor(cellular$TNF)
     cellular$IFN <- as.factor(cellular$IFN)
     mod <- fwdglm(y ~ TNF + IFN, data=cellular, family=poisson(log), nsamp=200)
     summary(mod)
     ## Not run: plot(mod)
     plot(mod, 1)
     plot(mod, 5)
     plot(mod, 6, ylim=c(-3, 20))
     plot(mod, 7)
     plot(mod, 8)
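
     The components listed under Value can also be inspected
     directly. The following is a sketch only: it repeats the fit
     from the example above, runs only when the 'forward' package is
     installed, and assumes the returned object can be indexed like
     a list. The name `comp_dims' is introduced here for
     illustration. No output is shown because the starting subset
     depends on random sampling.

```r
# Dimensions of selected components; stays empty if 'forward' is unavailable.
comp_dims <- list()

if (requireNamespace("forward", quietly = TRUE)) {
  library(forward)
  data(cellular)
  cellular$TNF <- as.factor(cellular$TNF)
  cellular$IFN <- as.factor(cellular$IFN)

  # trace = FALSE suppresses the progress messages during the search.
  mod <- fwdglm(y ~ TNF + IFN, data = cellular, family = poisson(log),
                nsamp = 200, trace = FALSE)

  # Component names are taken from the Value section.
  for (nm in c("Residuals", "Coefficients", "Likelihood")) {
    comp_dims[[nm]] <- dim(mod[[nm]])
  }
  print(comp_dims)

  # Starting subset chosen by 'lmsglm'.
  print(mod$inibsb)
}
```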

