mixreg                package:mixreg                R Documentation

_F_i_t _a _m_i_x_t_u_r_e _o_f _l_i_n_e_a_r _r_e_g_r_e_s_s_i_o_n_s.

_D_e_s_c_r_i_p_t_i_o_n:

     Estimates the parameters for a mixture of linear regressions,
     assuming Gaussian errors, using the EM algorithm.
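
     For orientation, the model being fitted can be written in the
     usual form for a finite mixture of Gaussian linear regressions.
     The notation is an assumption made here for illustration: K
     stands for ncomp, and beta_k, sigma_k^2 and lambda_k correspond
     to the beta, sigsq and lambda entries of each component, with x
     taken to include a leading 1 when intercept is TRUE.

```latex
f(y \mid x) = \sum_{k=1}^{K} \lambda_k \,
  \frac{1}{\sqrt{2\pi\sigma_k^2}}
  \exp\!\left( -\frac{(y - x^{\top}\beta_k)^2}{2\sigma_k^2} \right),
\qquad \lambda_k \ge 0, \quad \sum_{k=1}^{K} \lambda_k = 1 .
```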

_U_s_a_g_e:

     mixreg(x, y, ncomp=2, intercept=TRUE, eq.var=FALSE,
            theta.start=NULL, itmax=1000, eps=1e-06, verb=TRUE,
            digits=7, max.try=5, data.name=NULL)

_A_r_g_u_m_e_n_t_s:

       x: A matrix of predictors for each of the regression models in
          the mixture.  It should NOT include an initial column of 1s. 
          If there is only one predictor, x may be a vector. 

       y: The vector of responses for the regression models. 

   ncomp: The number of components in the mixture. 

intercept: Logical argument specifying whether the linear regressions
          should have intercepts fitted. 

  eq.var: Logical argument specifying whether the error variance should
          be the same for all components, or each component should be
          allowed a different error variance. 

theta.start: A list giving starting values for the estimation
          procedure.  Each component of the list is in turn a list with
          components beta (vector of linear coefficients), sigsq
          (variance) and lambda (mixing probability).  If eq.var is
          TRUE, then it is sensible to have all the starting values of
          sigsq equal, but this is not strictly necessary.  If
          theta.start is not specified, starting values are generated
          randomly; relying on random starts is NOT recommended. 

   itmax: The maximum number of EM steps to be undertaken. 

     eps: A value specifying the convergence criterion for the EM
          algorithm.  The algorithm is considered to have converged
          when the maximum absolute change in the parameters is less
          than eps. 

    verb: Logical argument; if verb is TRUE then details of the
          progress of the algorithm are printed out at each EM step. 

  digits: The number of digits to which the details are printed out,
          when verb is TRUE. 

 max.try: If the algorithm encounters a singularity in the likelihood
          (as may occur when eq.var is FALSE) the algorithm is
          restarted using new (randomly generated) starting values. 
          The restart is attempted a maximum of max.try times. 

data.name: A character string specifying a name associated with the
          data being analyzed, for identification purposes. 
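
     The structure expected for theta.start (a list with one inner
     list of beta, sigsq and lambda per component) can be sketched in
     base R as follows.  The helper make_theta_start is hypothetical
     and not part of the mixreg package; equal mixing probabilities
     are assumed when none are supplied.

```r
# Hypothetical helper (NOT part of mixreg): assemble a theta.start list.
#   betas  - list of coefficient vectors, one per component
#   sigsq  - error variance(s); a single value is recycled to all components
#   lambda - mixing probabilities; defaults to equal weights (an assumption)
make_theta_start <- function(betas, sigsq, lambda = NULL) {
  K <- length(betas)
  if (is.null(lambda)) lambda <- rep(1 / K, K)
  sigsq <- rep(sigsq, length.out = K)
  # Basic sanity checks: positive variances, probabilities summing to one.
  stopifnot(length(lambda) == K,
            all(lambda > 0),
            abs(sum(lambda) - 1) < 1e-8,
            all(sigsq > 0))
  # One inner list per component, with the names described for theta.start.
  lapply(seq_len(K), function(k) {
    list(beta = betas[[k]], sigsq = sigsq[k], lambda = lambda[k])
  })
}

# Two components, each with an intercept and one slope, common variance 16.
TS <- make_theta_start(betas = list(c(3.0, 0.1), c(0.0, 0.0)), sigsq = 16)
str(TS)
```

     The resulting TS could then be passed as theta.start=TS.  Giving
     every component the same sigsq follows the suggestion made for
     eq.var=TRUE, but is not required.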

_D_e_t_a_i_l_s:

     Even if eq.var is TRUE, each component of theta still has its own
     sigsq entry; in that case all of these entries hold the same
     value.

_V_a_l_u_e:

     A list, of class mixreg, with components

  parmat: The parameters of the fitted model arranged as a matrix, each
          row corresponding to one component of the mixture. 

   theta: The parameters of the fitted model as a list, each entry of
          the list being itself a list (like those in theta.start)
          corresponding to one component of the mixture. 

log.like: The log likelihood of the fitted model, based on Gaussian
          errors. 

     aic: The Akaike Information Criterion value for the fitted model;
          aic is equal to -2*log.like + 2*M, where M is the number of
          parameters in the model. 

intercept: The intercept argument of the call to mixreg. 

  eq.var: The eq.var argument of the call to mixreg. 

    bnms: A vector of names associated with the linear components of
          the regression models.  The names are formed from the column
          names of the argument x if these exist; otherwise they are
          "beta1", "beta2", .... The name "Int" is prepended if
          intercept is TRUE. 

  nsteps: The number of steps the EM algorithm took to converge. 

converged: Logical value indicating whether the algorithm converged
          or stopped because it reached itmax EM steps. 

data.name: The data.name argument if supplied; otherwise it is formed as
          "name-of-y.on.name-of-x". 

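     The aic formula above can be illustrated with a small base R
     sketch.  Both helpers are hypothetical and not part of mixreg.
     The parameter count M used here is the usual count of free
     parameters for such a mixture (coefficients, variances, and
     ncomp - 1 mixing probabilities); whether mixreg counts M in
     exactly this way is an assumption.  The log likelihood value is
     made up purely for the arithmetic.

```r
# Hypothetical helpers (NOT part of mixreg).

# Count free parameters: ncomp coefficient vectors of length n.beta,
# one variance (eq.var = TRUE) or ncomp variances, and ncomp - 1 free
# mixing probabilities.  This counting rule is an assumption.
count_params <- function(ncomp, n.beta, eq.var = FALSE) {
  n.var <- if (eq.var) 1 else ncomp
  ncomp * n.beta + n.var + (ncomp - 1)
}

# AIC as stated in the Value section: -2*log.like + 2*M.
aic_from_loglik <- function(log.like, M) {
  -2 * log.like + 2 * M
}

# Two components, intercept plus one slope each, separate variances.
M <- count_params(ncomp = 2, n.beta = 2, eq.var = FALSE)  # 4 + 2 + 1 = 7
aic_from_loglik(log.like = -100, M = M)                   # 200 + 14 = 214
```
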
_R_e_f_e_r_e_n_c_e_s:

     Turner, T. R.  (2000) Estimating the rate of spread of a viral
     infection of potato plants via mixtures of regressions.  Appl.
     Statist. vol. 49, Part 3, pp. 371 - 384.

     Dempster, A. P., Laird, N. M. and Rubin, D. B.  (1977) Maximum
     likelihood from incomplete data via the EM algorithm.  J. Royal
     Statist. Soc. B, vol. 39, pp. 1 - 22.

_S_e_e _A_l_s_o:

     bootcomp, cband, covmix, plot.cband, plot.mresid, qq.mix,
     resid.mix

_E_x_a_m_p_l_e_s:

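     data(aphids)
     x   <- aphids$n.aphids   # predictor
     y   <- aphids$n.inf      # response
     # Starting values: one list(beta, sigsq, lambda) per component;
     # beta holds the intercept and slope since intercept=TRUE by default.
     TS  <- list(list(beta=c(3.0,0.1),sigsq=16,lambda=0.5),
                 list(beta=c(0.0,0.0),sigsq=16,lambda=0.5))
     fit <- mixreg(x,y,ncomp=2,theta.start=TS,data.name='aphids')
     cvm <- covmix(fit,x,y)
     cbd <- cband(fit,cvm,x,y)
     plot(cbd)
     # Raw residuals, then standardized residuals for the Q-Q plot.
     r <- resid.mix(fit,x,y)
     plot(r)
     r <- resid.mix(fit,x,y,std=TRUE)
     qq.mix(r)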

