candisc               package:candisc               R Documentation

_C_a_n_o_n_i_c_a_l _d_i_s_c_r_i_m_i_n_a_n_t _a_n_a_l_y_s_i_s

_D_e_s_c_r_i_p_t_i_o_n:

     'candisc' performs a generalized canonical discriminant analysis
     for one term in a multivariate linear model (i.e., an 'mlm'
     object), computing canonical scores and vectors.  It represents a
     transformation of the original variables into a canonical space of
     maximal differences for the term, controlling for other model
     terms. To be of any use, the term should be a factor or
     interaction corresponding to a multivariate test with 2 or more
     degrees of freedom for the null hypothesis.

_U_s_a_g_e:

     candisc(mod, ...)

     ## S3 method for class 'mlm':
     candisc(mod, term, type = "2", manova, ndim = rank, ...)

     ## S3 method for class 'candisc':
     coef(object, type = c("std", "raw", "structure"), ...)

     ## S3 method for class 'candisc':
     plot(x, which = 1:2, conf = 0.95, col, pch, scale, asp = 1,
         var.col = "blue", var.lwd = par("lwd"), prefix = "Can", suffix=TRUE, 
         titles.1d = c("Canonical scores", "Structure"), ...)
         
     ## S3 method for class 'candisc':
     print(x, digits=max(getOption("digits") - 2, 3), ...)

     ## S3 method for class 'candisc':
     summary(object, means = TRUE, scores = FALSE, coef = c("std"),
         ndim, digits = max(getOption("digits") - 2, 4), ...)

_A_r_g_u_m_e_n_t_s:

     mod: An mlm object, such as computed by lm() with a multivariate
          response

    term: the name of one term from 'mod'

    type: type of test for the model 'term', one of: "II", "III", "2",
          or "3"

  manova: the 'Anova.mlm' object corresponding to 'mod'.  Normally,
          this is computed internally by  'Anova(mod)'

    ndim: Number of dimensions to store in (or retrieve from, for the
          'summary' method) the 'means', 'structure', 'scores' and
          'coeffs.*' components.  The default is the rank of the H
          matrix for the hypothesis term.

object, x: A candisc object

   which: A vector of two integers, selecting the canonical dimensions
          to plot

    conf: Confidence coefficient for the confidence circles plotted in
          the 'plot' method

     col: A vector of colors to be used for the levels of the term in
          the 'plot' method. In this version, you should assign colors
          and point symbols explicitly, rather than relying on the
          somewhat arbitrary defaults.

     pch: A vector of point symbols to be used for the levels of the
          term in the 'plot' method

   scale: Scale factor for the variable vectors in canonical space.  If
          not specified, a scale factor is calculated to make the
          variable vectors approximately fill the plot space. 

     asp: Aspect ratio for the 'plot' method.  Setting 'asp=1' (the
          default) ensures that the units on the horizontal and
          vertical axes are the same, so that lengths and angles of the
          variable vectors are interpretable.

 var.col: Color used to plot variable vectors

 var.lwd: Line width used to plot variable vectors

  prefix: Prefix used to label the canonical dimensions plotted

  suffix: Suffix for labels of canonical dimensions. If 'suffix=TRUE'
          the percent of hypothesis (H) variance accounted for by each
          canonical dimension is added to the axis label.

titles.1d: A character vector of length 2, containing titles for the
          panels used to plot the canonical scores and structure
          vectors, for the case in which there is only one canonical
          dimension.

   means: Logical value used to determine if canonical means are
          printed

  scores: Logical value used to determine if canonical scores are
          printed

    coef: Type of coefficients printed by the summary method. Any one
          or more of "std", "raw", or "structure"

  digits: significant digits to print.

     ...: arguments to be passed down.  In particular, 'type="n"' can
          be used with the 'plot' method to suppress the display of
          canonical scores.

_D_e_t_a_i_l_s:

     Canonical discriminant analysis is typically carried out in
     conjunction with a one-way MANOVA design. It represents a linear
     transformation of the response variables into a canonical space in
     which (a) each successive canonical variate produces maximal
     separation among the groups (e.g., maximum univariate F
     statistics), and (b) all canonical variates are mutually
     uncorrelated.  For a one-way MANOVA with g groups and p responses,
     there are 'dfh' = min(g-1, p) such canonical dimensions, and
     tests, initially stated by Bartlett (1938), allow one to determine
     the number of significant canonical dimensions.  Computational
     details for the one-way case are described in Cooley & Lohnes
     (1971), and in the SAS/STAT User's Guide, "The CANDISC procedure:
     Computational Details," <URL:
     http://support.sas.com/onlinedoc/913/getDoc/en/statug.hlp/candisc_sect12.htm>.
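
     As a rough illustration of the one-way case, the canonical
     dimensions can be obtained from the eigenvalues of HE^{-1}.  The
     following is a sketch in base R, not the code used by 'candisc'.
     The object names ('Y', 'H', 'E', 'lambda', 'coefs', 'scores') are
     chosen for illustration only, and the scaling of the coefficients
     is an assumption that may differ from the 'coeffs.raw' component.

```r
# One-way MANOVA on the built-in iris data (base R only)
Y   <- as.matrix(iris[, 1:4])
mod <- lm(Y ~ Species, data = iris)

# E: within-group (error) SSP matrix; H: between-group (hypothesis) SSP matrix
E <- crossprod(residuals(mod))
H <- crossprod(sweep(fitted(mod), 2, colMeans(Y)))

# Eigen-decomposition of HE^{-1}, via the symmetric form based on chol(E)
Rinv <- solve(chol(E))
eig  <- eigen(t(Rinv) %*% H %*% Rinv, symmetric = TRUE)

# Number of canonical dimensions: min(g - 1, p)
rank   <- sum(eig$values > 1e-8)
lambda <- eig$values[seq_len(rank)]
canrsq <- lambda / (1 + lambda)   # squared canonical correlations

# Canonical scores from the centered responses (illustrative scaling)
coefs  <- Rinv %*% eig$vectors[, seq_len(rank), drop = FALSE]
scores <- scale(Y, center = TRUE, scale = FALSE) %*% coefs

print(rank)
print(canrsq)
print(round(cor(scores), 10))     # off-diagonal entries are essentially zero
```

     With g = 3 species and p = 4 responses, this yields min(g-1, p) = 2
     canonical dimensions, and the two sets of scores are uncorrelated,
     as stated in (b) above.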

     A generalized canonical discriminant analysis extends this idea to
     a general multivariate linear model.  Analysis of each term in the
     'mlm' produces a hypothesis sum of squares and crossproducts (H)
     matrix of rank dfh that is tested against the error (E) matrix of
     rank dfe by the standard multivariate tests (Wilks' Lambda,
     Hotelling-Lawley trace, Pillai trace, Roy's maximum root test).
     For any given term in the 'mlm', the generalized canonical
     discriminant analysis amounts to a standard discriminant analysis
     based on the H matrix for that term in relation to the full-model
     E matrix.
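
     The generalized case can be sketched in the same way for a model
     with more than one term.  The example below is a hedged base R
     illustration, not the internal code of 'candisc': it assumes an
     additive two-factor model on the built-in 'mtcars' data and
     obtains H for one term as the increase in the residual SSP matrix
     when that term is dropped, which matches a type II test for a
     main-effects model.  All object names are hypothetical.

```r
# Two-factor additive model on the built-in mtcars data (base R only)
Y   <- as.matrix(mtcars[, c("mpg", "disp", "hp", "wt")])
dat <- data.frame(cyl = factor(mtcars$cyl), am = factor(mtcars$am))

full    <- lm(Y ~ cyl + am, data = dat)  # full model
reduced <- lm(Y ~ am, data = dat)        # same model without the term 'cyl'

# E from the full model; H for 'cyl' adjusted for 'am', taken as the
# increase in residual SSP when 'cyl' is dropped
E   <- crossprod(residuals(full))
H   <- crossprod(residuals(reduced)) - E
dfh <- reduced$df.residual - full$df.residual
dfe <- full$df.residual

# Eigenvalues of HE^{-1}, via the symmetric form based on chol(E)
Rinv   <- solve(chol(E))
lambda <- eigen(t(Rinv) %*% H %*% Rinv, symmetric = TRUE)$values

# Number of canonical dimensions and Wilks' Lambda for the term
rank  <- sum(lambda > 1e-8)
wilks <- prod(1 / (1 + lambda))
print(c(dfh = dfh, dfe = dfe, rank = rank, wilks = wilks))
```

     Here the number of non-zero eigenvalues equals dfh for the term,
     and Wilks' Lambda computed from the eigenvalues agrees with
     det(E) / det(H + E).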

_V_a_l_u_e:

     An object of class 'candisc' with the following components: 

    dfh : hypothesis degrees of freedom for 'term'

    dfe : error degrees of freedom for the 'mlm'

   rank : number of non-zero eigenvalues of HE^{-1}

eigenvalues : eigenvalues of HE^{-1}

 canrsq : squared canonical correlations

    pct : A vector giving each 'canrsq' value as a percentage of their
          total.

   ndim : Number of canonical dimensions stored in the 'means',
          'structure' and 'coeffs.*' components

  means : A data.frame containing the class means for the levels of the
          factor(s) in the term

factors : A data frame containing the levels of the factor(s) in the
          'term'

   term : name of the 'term'

  terms : A character vector containing the names of the terms in the
          'mlm' object

coeffs.raw : A matrix containing the raw canonical coefficients

coeffs.std : A matrix containing the standardized canonical
          coefficients

structure : A matrix containing the canonical structure coefficients on
          'ndim' dimensions, i.e., the correlations between the
          original variates and the canonical scores. These are
          sometimes referred to as Total Structure Coefficients.

 scores : A data frame containing the predictors in the 'mlm' model and
          the canonical scores on 'ndim' dimensions.  These are
          calculated as 'Y %*% coeffs.raw', where 'Y' contains the
          standardized response variables.

_A_u_t_h_o_r(_s):

     Michael Friendly and John Fox

_R_e_f_e_r_e_n_c_e_s:

     Bartlett, M. S. (1938). Further aspects of the theory of multiple
     regression. Proc. Camb. Phil. Soc. 34, 33-34.

     Cooley, W.W. & Lohnes, P.R. (1971). Multivariate Data Analysis, 
     New York: Wiley.

     Gittins, R. (1985). Canonical Analysis: A Review with Applications
     in Ecology, Berlin: Springer.

_S_e_e _A_l_s_o:

     'candiscList', 'heplot',  'heplot3d'

_E_x_a_m_p_l_e_s:

     grass.mod <- lm(cbind(N1,N9,N27,N81,N243) ~ Block + Species, data=Grass)
     Anova(grass.mod, test="Wilks")

     grass.can1 <- candisc(grass.mod, term="Species")
     plot(grass.can1, type="n")

     # library(heplots)
     heplot(grass.can1, scale=6)

     # iris data
     iris.mod <- lm(cbind(Petal.Length, Sepal.Length, Petal.Width, Sepal.Width) ~
                    Species, data=iris)
     # name the term explicitly; 'data' is not an argument of candisc()
     iris.can <- candisc(iris.mod, term="Species")
     #-- assign colors and symbols corresponding to species
     col <- rep(c("red", "black", "blue"), each=50)
     pch <- rep(1:3, each=50)
     plot(iris.can, col=col, pch=pch)

     heplot(iris.can)

     # 1-dim plot
     iris.can1 <- candisc(iris.mod, term="Species", ndim=1)
     plot(iris.can1)
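
     # The 'coef' and 'summary' methods listed under Usage can be
     # exercised as in the following sketch.  It refits the iris model
     # so that it is self-contained, and is wrapped in a
     # requireNamespace() guard (an addition for illustration) so that
     # it is skipped when the package is not installed.  The name
     # 'iris.struct' is hypothetical.

```r
if (requireNamespace("candisc", quietly = TRUE)) {
  library(candisc)

  iris.mod <- lm(cbind(Petal.Length, Sepal.Length, Petal.Width, Sepal.Width) ~
                 Species, data = iris)
  iris.can <- candisc(iris.mod, term = "Species")

  # Raw, standardized, and structure coefficients via the coef() method
  print(coef(iris.can, type = "raw"))
  print(coef(iris.can, type = "std"))
  iris.struct <- coef(iris.can, type = "structure")
  print(iris.struct)

  # Summary without class means, printing two kinds of coefficients
  summary(iris.can, means = FALSE, coef = c("std", "structure"))
}
```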

