Package: HDMD
Type: Package
Title: Statistical Analysis Tools for High Dimension Molecular Data
        (HDMD)
Version: 1.2
Date: 2013-2-26
Author: Lisa McFerrin
Maintainer: Lisa McFerrin <lgmcferr@ncsu.edu>
Depends: psych, MASS
Suggests: scatterplot3d
Description: High Dimensional Molecular Data (HDMD) typically have many
        more variables or dimensions than observations or replicates
        (D>>N).  This can cause many statistical procedures to fail,
        become intractable, or produce misleading results.  This
        package provides several tools to reduce dimensionality and
        analyze biological data for meaningful interpretation of
        results. Factor Analysis (FA), Principal Components Analysis
        (PCA) and Discriminant Analysis (DA) are frequently used
        multivariate techniques.  However, PCA methods prcomp and
        princomp do not reflect the proportion of total variation of
        each principal component.  Loadings.variation displays the
        relative and cumulative contribution of variation for each
        component by accounting for all variability in data. When D>>N,
        the maximum likelihood method cannot be applied in FA and the
        the principal axes method must be used instead, as in factor.pa
        of the psych package. The factor.pa.ginv function in this
        package further allows for a singular covariance matrix by
        applying a general inverse method to estimate factor scores.
        Moreover, factor.pa.ginv removes and warns of any variables
        that are constant, which would otherwise create an invalid
        covariance matrix. Promax.only further allows users to define
        rotation parameters during factor estimation.  Similar to the
        Euclidean distance, the Mahalanobis distance estimates the
        relationship among groups.  pairwise.mahalanobis computes all
        such pairwise Mahalanobis distances among groups and is useful
        for quantifying the separation of groups in DA. Genetic
        sequences are composed of discrete alphabetic characters, which
        makes estimates of variability difficult.  MolecularEntropy and
        MolecularMI calculate the entropy and mutual information to
        estimate variability and covariability, respectively, of DNA or
        Amino Acid sequences.  Functional grouping of amino acids
        (Atchley et al 1999) is also available for entropy and mutual
        information estimation.  Mutual information values can be
        normalized by NMI to account for the background distribution
        arising from the stochastic pairing of independent, random
        sites. Alternatively, discrete alphabetic sequences can be
        transformed into biologically informative metrics to be used in
        various multivariate procedures.  FactorTransform converts
        amino acid sequences using the amino acid indices determined by
        Atchley et al 2005.
License: GPL (>= 2)
Packaged: 2013-02-26 21:32:05 UTC; LisaMc
NeedsCompilation: no
Repository: CRAN
Date/Publication: 2013-02-27 07:31:03
Built: R 3.1.0; ; 2013-11-26 04:52:39 UTC; unix
