lga                   package:lga                   R Documentation

_P_e_r_f_o_r_m _L_G_A

_D_e_s_c_r_i_p_t_i_o_n:

     Linear Grouping Analysis

_U_s_a_g_e:

     lga(x, k, biter = NULL, niter = 10, showall = FALSE, scale = TRUE,
         nnode=NULL, silent=FALSE)

_A_r_g_u_m_e_n_t_s:

       x: a numeric matrix.

       k: an integer for the number of clusters.

   biter: an integer for the number of different starting hyperplanes
          to try.

   niter: an integer for the number of iterations to attempt for
          convergence.

 showall: logical.  If TRUE then display all the outcomes, not just the
          best one.

   scale: logical.  If TRUE, scale the data by dividing each column by
          its standard deviation before fitting.

   nnode: an integer for the number of CPUs to use for parallel
          processing.  Defaults to NULL, i.e. no parallel processing.

  silent: logical.  If TRUE, produces no text output during processing.
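
     The scaling described for 'scale' can be sketched in base R as
     follows.  This is a minimal illustration that assumes each column
     is divided by its sample standard deviation without centring;
     'scale_by_sd' is a hypothetical helper, not part of the package,
     and the package's internal code may differ.

```r
## Hypothetical helper: divide each column by its standard deviation.
## Assumption: no centring is applied, only rescaling.
scale_by_sd <- function(x) {
  x <- as.matrix(x)
  col_sd <- apply(x, 2, sd)   # one standard deviation per column
  sweep(x, 2, col_sd, "/")    # divide column j by col_sd[j]
}

set.seed(1)
m <- cbind(rnorm(50, sd = 3), rnorm(50, sd = 0.2))
apply(scale_by_sd(m), 2, sd)  # each column now has unit standard deviation
```

     Rescaling this way stops a column measured in large units from
     dominating the orthogonal distances.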

_D_e_t_a_i_l_s:

     This function searches for k clusters using the linear grouping
     algorithm (LGA) described in Van Aelst et al. (2006).  It starts
     from 'biter' different initial hyperplanes and, for each start,
     runs up to 'niter' iterations to reach convergence.  It then
     selects the clustering with the smallest Residual Orthogonal Sum
     of Squares (ROSS).
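
     To make the selection criterion concrete, the sketch below
     computes a residual orthogonal sum of squares for a given
     clustering.  It assumes that each cluster is fitted by the
     orthogonal-regression hyperplane through its centroid, so that
     the cluster's contribution equals the smallest eigenvalue of its
     centred scatter matrix.  'ross_sketch' is a hypothetical helper
     written for illustration; it is not the package's internal code
     and may differ from it in scaling or other details.

```r
## Hypothetical helper: sum, over clusters, of squared orthogonal
## distances from each point to its cluster's best-fitting hyperplane.
ross_sketch <- function(x, cluster) {
  x <- as.matrix(x)
  total <- 0
  for (g in unique(cluster)) {
    xg <- x[cluster == g, , drop = FALSE]
    xc <- sweep(xg, 2, colMeans(xg))   # centre the cluster
    ## The smallest eigenvalue of the scatter matrix is the sum of
    ## squared orthogonal residuals about the fitted hyperplane.
    ev <- eigen(crossprod(xc), symmetric = TRUE, only.values = TRUE)$values
    total <- total + min(ev)
  }
  total
}

## Two exact lines: the correct grouping gives a value near zero,
## while a mixed-up grouping gives a clearly larger value.
x <- rbind(cbind(1:10, 2 * (1:10)), cbind(1:10, 3 - (1:10)))
ross_sketch(x, rep(1:2, each = 10))
ross_sketch(x, rep(1:2, times = 10))
```

     Comparing this quantity across starts is what allows the best of
     the 'biter' candidate clusterings to be chosen.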

     If 'biter' is left as NULL, then it is selected via the equation
     given in Van Aelst et al. (2006).

     This function is parallel computing aware via the 'nnode'
     argument, and works with the package 'snow'.  In order to use
     parallel computing, one of MPI (e.g. lamboot) or PVM is necessary.
     For further details, see the documentation for 'snow'.

     Associated with the lga function are a print method and a plot
     method (see the examples).  In the plot method, the fitted
     hyperplanes are also shown as dashed lines.  When there are more
     than 2 dimensions, these lines represent the intersection of the
     fitted hyperplanes with the plane for each pair of axes.

_V_a_l_u_e:

     An object of class '"lga"' with components 

 cluster: a vector containing the cluster memberships.

    ROSS: the Residual Orthogonal Sum of Squares for the solution.

converged: a logical. TRUE if at least one solution has converged.

   biter: the biter setting used.

   niter: the niter setting used.

nconverg: the number of converged solutions (out of biter starts).

  scaled: logical. TRUE if the data were scaled.

       k: the number of clusters to be found.

       x: the (scaled if selected) dataset.

_A_u_t_h_o_r(_s):

     Justin Harrington harringt@stat.ubc.ca

_R_e_f_e_r_e_n_c_e_s:

     Van Aelst, S. and Wang, X. and Zamar, R. and Zhu, R. (2006)
     'Linear Grouping Using Orthogonal Regression', _Computational
     Statistics & Data Analysis_ *50*, 1287-1312.

_S_e_e _A_l_s_o:

     'gap'

_E_x_a_m_p_l_e_s:

     ## Synthetic Data
     ## Make a dataset with 2 clusters in 2 dimensions

     library(MASS)
     set.seed(1234)
     X <- rbind(mvrnorm(n=100, mu=c(1,-1), Sigma=diag(0.1,2)+0.9),
                 mvrnorm(n=100, mu=c(1,1), Sigma=diag(0.1,2)+0.9))

     lgaout <- lga(X,2)
     plot(lgaout)
     print(lgaout)

     ## nhl94 data set

     data(nhl94)
     plot(lga(nhl94, k=3, niter=30))

     ## Allometry data set
     data(brain)
     plot(lga(log(brain, base=10), k=3))

     ## Second Allometry data set
     data(ob)
     plot(lga(log(ob[,2:3]), k=3), pch=as.character(ob[,1]))

     ## Parallel processing case
     ## This example runs on 4 nodes.

     ## Not run: 
     set.seed(1234)
     X <- rbind(mvrnorm(n=1e6, mu=c(1,-1), Sigma=diag(0.1,2)+0.9),
                 mvrnorm(n=1e6, mu=c(1,1), Sigma=diag(0.1,2)+0.9))
     abc <- lga(X, k=2, nnode=4)
     ## End(Not run)

