fwdmv                 package:Rfwdmv                 R Documentation

_M_u_l_t_i_v_a_r_i_a_t_e _F_o_r_w_a_r_d _S_e_a_r_c_h

_D_e_s_c_r_i_p_t_i_o_n:

     This function computes a multivariate forward search.  Several
     diagnostic statistics are monitored during the search: see
     'fwdmv.object'.

_U_s_a_g_e:

     fwdmv(X, groups = NULL, alpha = 0.6, beta = 0.75,
           bsb = ellipse.subset, balanced = TRUE, scaled = TRUE,
           constrained = TRUE, monitor = "all")

_A_r_g_u_m_e_n_t_s:

       X: a matrix or data frame containing a multivariate data set.

  groups: a list of one or more integer vectors giving the indices of
          the units in each tentative group.  All elements must be
          unique, that is, no unit may appear in more than one group.
          Units not belonging to any group are classified as
          unassigned.  If omitted, all of the data are assumed to come
          from a single multivariate normal population.

   alpha: a numeric value between 0 and 1 specifying the fraction of
          the units in each group that will be included in the initial
          subset.

    beta: a numeric value between 'alpha' and 1 specifying the fraction
          of the units in each tentative group that must be included in
          the subset before the unassigned units are allowed to be
          included.  A large value of 'beta' ensures that the centroid
          and variance-covariance matrix estimates stabilize before the
          unassigned units enter the subsets.

     bsb: a function of two arguments: the multivariate data in matrix
          form 'X' and the number of units in the initial subset
          'size'.  The default 'bsb = ellipse.subset' computes the
          initial subset using robustly centered ellipses.  Other
          choices include 'bsb = bb.subset' to compute the initial
          subset using bivariate boxplots, 'bsb = mcd.subset' to
          compute the initial subset using MCD (minimum covariance
          determinant) distances, and 'bsb = random.subset' for a
          randomly determined initial subset.  Alternatively, the
          initial subset may be specified directly by providing an
          integer vector containing the indices of the units to be in
          the initial subset.

balanced: a logical value.  If 'TRUE' then units are added to the
          subset so that the group ratios in the subset stay as close
          as possible to the group ratio in the data.

  scaled: a logical value.  If 'TRUE' then the Mahalanobis distances
          are scaled using the 2p-th root of the determinant of the
          estimated covariance matrix, where p is the number of
          variables.  This is intended to compensate for clusters with
          significantly different dispersions.

constrained: a logical value.  If 'TRUE' then the forward search chooses
          units from the tentative groups until a fraction 'beta' of
          each group is in the subset; then unassigned units are
          allowed into the subset.  If 'FALSE' then unassigned units
          may enter the subset at any time during the forward search.
          Note that when 'constrained = FALSE', the argument
          'balanced' is ignored.

 monitor: a character vector specifying which statistics are to be
          monitored during the forward search.  The default value "all"
          monitors all statistics.  Otherwise choose from "distance",
          "center", "cov", "determinant", "unit", "max", "mth", "min",
          "mpo", "nearest", and "misclassified".
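
     As a rough illustration of how 'alpha', 'beta' and 'bsb' fit
     together, the sketch below uses base R only.  The helpers
     'search.sizes' and 'closest.subset' are hypothetical and are not
     part of Rfwdmv.  The rounding rules, and the assumption that a
     'bsb' function returns the indices of the selected units, are
     guesses rather than documented behaviour.

```r
# Hypothetical helper (not part of Rfwdmv): subset sizes implied by
# alpha and beta for each tentative group.  The use of floor() and
# ceiling() is an assumption about rounding.
search.sizes <- function(group.sizes, alpha = 0.6, beta = 0.75) {
  stopifnot(alpha > 0, alpha < 1, beta >= alpha, beta <= 1)
  data.frame(group   = seq_along(group.sizes),
             n       = group.sizes,
             initial = floor(alpha * group.sizes),    # initial subset size
             release = ceiling(beta * group.sizes))   # size at which
}                                                     # unassigned units may enter

# Group sizes matching the two tentative groups in the Examples section.
sizes <- search.sizes(c(52, 45))
print(sizes)

# Hypothetical initializer with the (X, size) signature described for
# 'bsb': keep the 'size' units closest (Euclidean distance) to the
# coordinatewise median.  Returning unit indices is an assumption.
closest.subset <- function(X, size) {
  X <- as.matrix(X)
  med <- apply(X, 2, median)               # coordinatewise median
  d <- sqrt(rowSums(sweep(X, 2, med)^2))   # distance of each unit to it
  sort(order(d)[seq_len(size)])            # indices of the closest units
}

set.seed(42)
X <- matrix(rnorm(20), ncol = 2)           # 10 simulated units, 2 variables
idx <- closest.subset(X, size = 6)
print(idx)
```

     If the assumption about the return value holds, a function of this
     form could be supplied as 'bsb = closest.subset'.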

_D_e_t_a_i_l_s:

     Initial group subsets of size 'alpha * nbsb[i]' (where 'nbsb[i]'
     is the number of units assigned to tentative group i) are obtained
     by running the initialization function on each group.  Estimates
     of the center and covariance matrix are computed for each group
     using the units currently in the group subset.  The Mahalanobis
     distance for each unit in a tentative group is computed using the
     center and covariance matrix estimates for that group.  The
     Mahalanobis distance for each unassigned unit is computed by
     calculating the distance to each group and taking the minimum.  If
     the search is balanced then one unit is added to the group subset
     whose share of the units is currently furthest below that group's
     share of the data.  If the search is not balanced then the unit
     (not in any subset) with the smallest distance is allocated to
     the nearest group.  If the
     search is constrained then the unassigned units are not allowed
     into the group subsets until each group subset contains a fraction
     'beta' of the units in the tentative groups.  If the search is not
     constrained then the unassigned units may enter the subset at any
     time during the search.
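
     The distance calculations and a single step of the search can be
     sketched in base R as follows.  This is a minimal illustration on
     simulated data, not the Rfwdmv implementation: all object names
     are hypothetical, the current subsets are chosen arbitrarily, and
     no scaling of the distances is applied.

```r
set.seed(1)
X <- rbind(matrix(rnorm(40), ncol = 2),            # 20 units near (0, 0)
           matrix(rnorm(40, mean = 4), ncol = 2),  # 20 units near (4, 4)
           c(0.2, -0.1), c(3.8, 4.1))              # 2 unassigned units
groups     <- list(1:20, 21:40)    # tentative groups
unassigned <- 41:42
subsets    <- list(1:12, 21:32)    # current group subsets (arbitrary here)

# Mahalanobis distance of every unit to every group, using the center
# and covariance matrix estimated from that group's current subset.
dist <- sapply(subsets, function(s) {
  Xs <- X[s, , drop = FALSE]
  sqrt(mahalanobis(X, center = colMeans(Xs), cov = cov(Xs)))
})

# Distance attached to each unit: its own group's distance for units in
# a tentative group, the minimum over groups for unassigned units.
unit.dist  <- numeric(nrow(X))
unit.group <- integer(nrow(X))
for (g in seq_along(groups)) {
  unit.dist[groups[[g]]]  <- dist[groups[[g]], g]
  unit.group[groups[[g]]] <- g
}
d.un <- dist[unassigned, , drop = FALSE]
unit.dist[unassigned]  <- apply(d.un, 1, min)
unit.group[unassigned] <- apply(d.un, 1, which.min)

# Balanced search: the group subset furthest below its share of the data.
target  <- lengths(groups) / sum(lengths(groups))
current <- lengths(subsets) / sum(lengths(subsets))
lagging <- which.min(current - target)

# Unbalanced, unconstrained step: the unit outside every subset with the
# smallest distance joins the subset of its (nearest) group.
candidates <- setdiff(seq_len(nrow(X)), unlist(subsets))
entering   <- candidates[which.min(unit.dist[candidates])]
subsets[[unit.group[entering]]] <- c(subsets[[unit.group[entering]]], entering)
```

     In a constrained search the candidate set in the last step would
     exclude the unassigned units until every group subset had reached
     its 'beta' fraction.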

_V_a_l_u_e:

     a list with class 'fwdmv'.

_A_u_t_h_o_r(_s):

     Kjell Konis

_R_e_f_e_r_e_n_c_e_s:

     Atkinson, A. C., Riani, M. and Cerioli, A. (2004) Exploring
     Multivariate Data with the Forward Search. Springer-Verlag New
     York.

_S_e_e _A_l_s_o:

     'fwdmv.object'

_E_x_a_m_p_l_e_s:

     # Load the package and the example data set.
     library(Rfwdmv)
     data(fondi.dat)

     # Tentative groups; units listed in neither vector are unassigned.
     g1 <- c(1:20, 22:49, 51, 53, 55, 56)
     g2 <- c(57:76, 78:88, 90:103)

     # Run the forward search with the default settings.
     fondi.fwdmv <- fwdmv(fondi.dat, groups = list(g1, g2))

