mda                   package:mda                   R Documentation

_M_i_x_t_u_r_e _D_i_s_c_r_i_m_i_n_a_n_t _A_n_a_l_y_s_i_s

_D_e_s_c_r_i_p_t_i_o_n:

     Fit a mixture discriminant analysis model, in which each class is
     represented by a mixture of subclasses.

_U_s_a_g_e:

     mda(formula, data, subclasses, sub.df, tot.df, dimension, eps,
         iter, weights, method, keep.fitted, trace, ...)

_A_r_g_u_m_e_n_t_s:

 formula: a formula of the form 'y ~ x' describing the response and
          the predictors.  The formula can be more complicated, such
          as 'y ~ log(x) + z' (see 'formula' for more details).  The
          response should be a factor, or any vector that can be
          coerced to one (such as a logical variable).

    data: data frame containing the variables in the formula
          (optional).

subclasses: Number of subclasses per class, default is 3.  Can be a
          vector with a number for each class.

  sub.df: the effective degrees of freedom of the centroids per class,
          used if subclass centroid shrinking is performed.  Can be a
          scalar, in which case the same number is used for each
          class, or a vector with one value per class.

  tot.df: The total df for all the centroids can be specified rather
          than separately per class.

dimension: The dimension of the reduced model.  If we know our final
          model will be confined to a discriminant subspace (of the
          subclass centroids), we can specify this in advance and have
          the EM algorithm operate in this subspace.

     eps: A numerical threshold for automatically truncating the
          dimension.

    iter: A limit on the total number of iterations; the default is 5.

 weights: _NOT_ observation weights!  This is a special weight
          structure, which for each class assigns a weight (prior
          probability) to each of the observations in that class of
          belonging to one of the subclasses.  The default is provided
          by a call to 'mda.start(x, g, subclasses, trace, ...)' (by
          this time 'x' and 'g' are known).  See the help for
          'mda.start'.  Arguments for 'mda.start' can be provided via
          the '...' argument to 'mda', so the 'weights' argument need
          never be set directly.  A previously fitted 'mda' object can
          be supplied, in which case its final subclass
          'responsibility' weights are used for 'weights'.  This
          allows the iterations from a previous fit to be continued.

  method: the regression method used in optimal scaling.  The default
          is linear regression via the function 'polyreg', resulting
          in the usual mixture model.  Other possibilities are 'mars'
          and 'bruto'.  For penalized mixture discriminant models
          'gen.ridge' is appropriate.

keep.fitted: a logical variable, which determines whether the
          (sometimes large) component '"fitted.values"' of the 'fit'
          component of the returned 'mda' object should be kept.  The
          default is 'TRUE' if 'n * dimension < 1000'.

   trace: if 'TRUE', iteration information is printed.  Note that the
          deviance reported is for the posterior class likelihood, and
          not the full likelihood, which is used to drive the EM
          algorithm under 'mda'.  In general the latter is not
          available.

     ...: additional arguments to 'mda.start' and to 'method'.
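
     As a hedged illustration of the 'subclasses' and 'weights'
     arguments described above, the sketch below fits a short run and
     then continues it by supplying the earlier fit as 'weights'.  The
     object names 'fit1' and 'fit2', the subclass counts, the
     iteration limits and the seed are illustrative assumptions, not
     recommendations.

```r
library(mda)

set.seed(1)  # assumed seed; the starting weights may depend on random starts

# One subclass count per class, in the order of levels(iris$Species):
# two for setosa, three for versicolor, two for virginica.
fit1 <- mda(Species ~ ., data = iris, subclasses = c(2, 3, 2), iter = 2)

# Continue from 'fit1': its final subclass responsibilities are used
# as the starting 'weights' for the new run.
fit2 <- mda(Species ~ ., data = iris, subclasses = c(2, 3, 2),
            weights = fit1, iter = 5)

fit2$confusion  # training-data confusion matrix of the continued fit
```

     Supplying the earlier fit avoids a fresh call to 'mda.start', so
     the second run picks up where the first one stopped.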

_V_a_l_u_e:

     An object of class 'c("mda", "fda")'.  The most useful extractor
     is 'predict', which can make many types of predictions from this
     object.  It can also be plotted, and any functions useful for fda
     objects will work here too, such as 'confusion' and 'coef'.

     The object has the following components: 

percent.explained: the percent between-group variance explained by each
          dimension (relative to the total explained).  The printed
          summary in the Examples reports these cumulatively.

  values: optimal scaling regression sum-of-squares for each dimension
          (see references).

   means: subclass means in the discriminant space.  These are also
          scaled versions of the final thetas or class scores, and can
          be used in a subsequent call to 'mda' (this only makes sense
          if some columns of theta are omitted; see the references).

theta.mod: (internal) a class scoring matrix which allows 'predict' to
          work properly.

dimension: dimension of discriminant space.

sub.prior: subclass membership priors, computed in the fit.  No effort
          is currently spent in trying to keep these above a threshold.

   prior: class proportions for the training data.

     fit: fit object returned by 'method'.

    call: the call that created this object (allowing it to be
          'update'-able).

confusion: confusion matrix when classifying the training data.

 weights: the subclass membership probabilities for each member of the
          training set; see the 'weights' argument.

assign.theta: a pointer list which identifies which elements of certain
          lists belong to individual classes.

deviance: The multinomial log-likelihood of the fit.  Even though the
          full log-likelihood drives the iterations, we cannot in
          general compute it because of the flexibility of the 'method'
          used.  The deviance can increase with the iterations, but
          generally does not.
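
     The components listed above can be read directly from the fitted
     object.  The following is a minimal sketch, assuming the 'mda'
     package is attached and using the 'iris' data; the object names
     'fit' and 'w1' and the seed are illustrative assumptions.

```r
library(mda)

set.seed(1)  # assumed seed, only to make the starting weights repeatable
fit <- mda(Species ~ ., data = iris)

fit$prior      # class proportions for the training data
fit$sub.prior  # subclass membership priors, computed in the fit
fit$confusion  # confusion matrix when classifying the training data
fit$deviance   # deviance of the fit, as described above

# Subclass membership probabilities for the first class.  Each row is
# expected to sum to one across that class's subclasses.
w1 <- fit$weights[[1]]
head(w1)
range(rowSums(w1))
```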


     The 'method' functions are required to take arguments 'x' and 'y',
     both of which can be matrices, and should produce a matrix of
     'fitted.values' the same size as 'y'.  They can take an additional
     argument 'weights' and should all have a '...' argument for
     safety's sake.  Any arguments to 'method' can be passed on via the
     '...' argument of 'mda'.  The default method 'polyreg' has a
     'degree' argument which allows polynomial regression of the
     required total degree.  See the documentation for 'predict.fda'
     for further requirements of 'method'.

     The function 'mda.start' creates the starting weights; it takes
     additional arguments which can be passed in via the '...' argument
     to 'mda'.  See the documentation for 'mda.start'.
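
     To illustrate passing arguments through '...', the sketch below
     requests quadratic polynomial regression from the default
     'polyreg' method via its 'degree' argument.  The object names
     'fit.lin' and 'fit.quad' and the seed are illustrative
     assumptions; which degree suits a given data set is not addressed
     here.

```r
library(mda)

set.seed(1)  # assumed seed; the starting weights may depend on random starts

# Default fit: linear regression via 'polyreg'.
fit.lin <- mda(Species ~ ., data = iris)

# 'degree = 2' is not an argument of 'mda' itself; it is handed on
# through '...' to the regression method 'polyreg'.
fit.quad <- mda(Species ~ ., data = iris, degree = 2)

# Compare the training-data confusion matrices of the two fits.
fit.lin$confusion
fit.quad$confusion
```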

_N_o_t_e:

     This software is not well tested; we would like to hear of any
     bugs.

_A_u_t_h_o_r(_s):

     Trevor Hastie and Robert Tibshirani

_R_e_f_e_r_e_n_c_e_s:

     "Flexible Discriminant Analysis by Optimal Scoring" by Hastie,
     Tibshirani and Buja, 1994, JASA, 1255-1270.

     "Penalized Discriminant Analysis" by Hastie, Buja and
     Tibshirani, Annals of Statistics, 1995 (in press).

     "Discriminant Analysis by Gaussian Mixtures" by Hastie and
     Tibshirani, 1994, JRSS-B (in press).

_S_e_e _A_l_s_o:

     'predict.mda', 'mars', 'bruto', 'polyreg', 'gen.ridge', 'softmax',
     'confusion'

_E_x_a_m_p_l_e_s:

     data(iris)
     irisfit <- mda(Species ~ ., data = iris)
     irisfit
     ## Call:
     ## mda(formula = Species ~ ., data = iris)
     ##
     ## Dimension: 4
     ##
     ## Percent Between-Group Variance Explained:
     ##     v1     v2     v3     v4
     ##  96.02  98.55  99.90 100.00
     ##
     ## Degrees of Freedom (per dimension): 5
     ##
     ## Training Misclassification Error: 0.02 ( N = 150 )
     ##
     ## Deviance: 15.102

     data(glass)
     # random sample of size 100
     samp <- c(1, 3, 4, 11, 12, 13, 14, 16, 17, 18, 19, 20, 27, 28, 31,
               38, 42, 46, 47, 48, 49, 52, 53, 54, 55, 57, 62, 63, 64, 65,
               67, 68, 69, 70, 72, 73, 78, 79, 83, 84, 85, 87, 91, 92, 94,
               99, 100, 106, 107, 108, 111, 112, 113, 115, 118, 121, 123,
               124, 125, 126, 129, 131, 133, 136, 139, 142, 143, 145, 147,
               152, 153, 156, 159, 160, 161, 164, 165, 166, 168, 169, 171,
               172, 173, 174, 175, 177, 178, 181, 182, 185, 188, 189, 192,
               195, 197, 203, 205, 211, 212, 214)
     glass.train <- glass[samp, ]
     glass.test <- glass[-samp, ]
     glass.mda <- mda(Type ~ ., data = glass.train)
     # posterior class probabilities for the held-out rows;
     # the value of 'type' may be abbreviated
     predict(glass.mda, glass.test, type = "post")
     # confusion matrix for the held-out rows
     confusion(glass.mda, glass.test)
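
     A further sketch, in the same spirit as the 'glass' example,
     estimates a held-out misclassification rate on 'iris'.  The
     100/50 split, the seed and the names 'train.rows', 'iris.train',
     'iris.test', 'pred', 'post' and 'test.error' are illustrative
     assumptions, and the default prediction type is assumed to be the
     predicted class.

```r
library(mda)

set.seed(1)                             # assumed seed for the split
train.rows <- sample(nrow(iris), 100)   # hypothetical 100/50 split
iris.train <- iris[train.rows, ]
iris.test  <- iris[-train.rows, ]

iris.mda <- mda(Species ~ ., data = iris.train)

# Predicted classes for the held-out rows (assumed default type).
pred <- predict(iris.mda, iris.test)

# Proportion of held-out rows whose predicted class differs from the truth.
test.error <- mean(as.character(pred) != as.character(iris.test$Species))
test.error

# Posterior class probabilities: one row per held-out case.
post <- predict(iris.mda, iris.test, type = "posterior")
dim(post)
```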

