oregMclust               package:edci               R Documentation

_O_r_t_h_o_g_o_n_a_l _R_e_g_r_e_s_s_i_o_n _C_l_u_s_t_e_r_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Computation of center points for regression data by orthogonal
     regression. A cluster method based on redescending M-estimators is
     used.

_U_s_a_g_e:

       oregMclust(datax, datay, bw,
                  method="const",
                  xrange=range(datax), yrange=range(datay), prec=4,
                  na=1, sa=NULL, nl=10, nc=NULL, brmaxit=1000)

       regparm(reg)

       plot.oregMclust(x, datax, datay, prec=3,
                       rcol="black", rlty=1, rlwd=3, ...)

       print.oregMclust(x, ...)

_A_r_g_u_m_e_n_t_s:

datax,datay: numerical vectors of coordinates of the observations.
          Alternatively, a matrix with two or three columns can be
          given. The first two columns are interpreted as coordinates
          of the observations and, if available, the third is passed to
          parameter 'sa'.

      bw: positive number. Bandwidth for the cluster method.

  method: optional string. Method of choosing starting values for
          maximization. Possible values are:

             *  "const": a constant number of angles is used for every
                observation. By default, one horizontal line through
                each observation is used as starting value. If 'na' is
                given, 'na' lines through each observation are used.
                Alternatively, the parameter 'sa' can specify a
                starting angle for every observation. In this case,
                'na' is ignored. The length of 'sa' must equal the
                number of observations.

             *  "all": every line through any two observations is used.

             *  "prob": Clusters are searched iteratively with randomly
                chosen starting values until either no new clusters are
                found (default), or until 'nc' clusters are found. The
                precision of distinguishing the clusters can be tuned
                with the parameter 'prec'. In each iteration, 'nl'
                lines, each through two randomly chosen observations,
                are used as starting values.
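
          The starting lines for "all" and "prob" can be pictured with
          the following base R sketch. It is not taken from the package
          source: 'line_through' is a hypothetical helper, and the way
          edci draws its random pairs may differ. It only shows how a
          line through two observations maps to the (alpha, beta)
          parameters used on this page.

```r
# Hypothetical helper (not part of edci): parameters (alpha, beta) of the
# line through two points p and q, written as
# cos(alpha) * x + sin(alpha) * y = beta.
line_through <- function(p, q) {
  p <- as.numeric(p)
  q <- as.numeric(q)
  d <- q - p                               # direction of the line
  n <- c(-d[2], d[1])                      # a vector orthogonal to d
  alpha <- atan2(n[2], n[1]) %% (2 * pi)   # angle of the normal in [0, 2*pi)
  beta <- cos(alpha) * p[1] + sin(alpha) * p[2]
  c(alpha = alpha, beta = beta)
}

set.seed(1)
pts <- cbind(x = rnorm(5), y = rnorm(5))

# "all": one starting line for every pair of observations
pairs <- combn(nrow(pts), 2)
starts_all <- t(apply(pairs, 2, function(ij) {
  line_through(pts[ij[1], ], pts[ij[2], ])
}))

# "prob": nl starting lines through randomly chosen pairs
nl <- 10
starts_prob <- t(replicate(nl, {
  ij <- sample(nrow(pts), 2)
  line_through(pts[ij[1], ], pts[ij[2], ])
}))
```

          With n observations, "all" produces n*(n-1)/2 starting lines,
          which grows quickly; this is the practical motivation for the
          random sampling in "prob".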

xrange, yrange: optional numerical intervals describing the domains of
          the observations. This is only used for normalization of the
          data. Note that both intervals should have approximately the
          same length; otherwise, the data should be transformed first.
          This is not done automatically, since the transformation
          affects the choice of the bandwidth.
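
          A quick check of the two interval lengths might look like the
          sketch below. The rescaling shown is only one possible manual
          transformation and is an assumption, not something the
          package performs or recommends.

```r
set.seed(1)
x <- runif(50, 0, 10)            # x spans roughly 10 units
y <- 100 * x + rnorm(50)         # y spans roughly 1000 units

len_x <- diff(range(x))
len_y <- diff(range(y))
ratio <- len_y / len_x           # far from 1: the domains differ strongly

# One possible manual transformation (an assumption, not done by the
# package): rescale y so that both domains have the same length. The
# bandwidth must then be chosen on this rescaled scale.
y_scaled <- y / ratio
```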

    prec: optional positive integer. Tuning parameter for
          distinguishing different clusters, which is passed to
          'deldupMclust'.

      na: optional positive integer. Number of angles per observation
          used as starting values if 'method="const"' is chosen
          (default).

      sa: optional numerical vector. Angles (within [0,2pi)) used as
          starting values if 'method="const"' is chosen (default).

      nl: optional positive integer. Number of starting lines in each
          iteration, if 'method="prob"' is chosen.

      nc: optional positive integer. Number of clusters to search for,
          if 'method="prob"' is chosen. Note that if 'nc' is too large,
          i.e. 'nc' clusters cannot be found, the function does not
          terminate. Attention! On Windows, the routine cannot even be
          interrupted manually in this case!

 brmaxit: optional positive integer. Since the maximization could be
          very slow in some cases depending on the starting value, the
          maximization is stopped after 'brmaxit' iterations.

   reg,x: Object returned from 'oregMclust'.

rcol,rlty,rlwd: optional graphic parameters used for plotting
          regression lines.

     ...: Additional parameters passed to 'plot'.

_D_e_t_a_i_l_s:

     'oregMclust' implements a cluster method based on redescending
     M-estimators for the case of orthogonal regression. This method was
     introduced by Müller and Garlipp in 2003 (see References).
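
     To give a feel for the kind of objective being maximized, here is
     a minimal sketch in base R. It assumes a Gaussian score function
     applied to the orthogonal distances scaled by the bandwidth; the
     function name 'oreg_objective' is hypothetical and the kernel
     actually used inside edci may differ.

```r
# Illustrative objective, ASSUMING a Gaussian score function; this is
# not the edci implementation.
oreg_objective <- function(alpha, beta, x, y, bw) {
  r <- cos(alpha) * x + sin(alpha) * y - beta   # signed orthogonal distances
  mean(dnorm(r / bw)) / bw
}

set.seed(1)
x <- rnorm(100, 0, 3)
y <- 2 * x + 1 + rnorm(100, 0, 0.2)   # points near the line y = 2x + 1

# The line y = 2x + 1 written as cos(alpha) * x + sin(alpha) * y = beta
alpha <- atan2(1, -2)                 # normal direction (-2, 1)
beta  <- 1 / sqrt(5)

on_line  <- oreg_objective(alpha, beta, x, y, bw = 0.5)
off_line <- oreg_objective(alpha, beta + 3, x, y, bw = 0.5)
```

     Here 'on_line' is much larger than 'off_line': observations far
     from a candidate line contribute almost nothing, which is why
     such an objective has one local maximum per linear cluster.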

     'regparm' transforms the columns "alpha" and "beta" to "intersept"
     and "slope".

     See also 'bestMclust', 'projMclust', and 'envMclust' for choosing
     the 'real' clusters among those found.

_V_a_l_u_e:

     The return value is a numerical matrix containing one row for
     every regression center line found. The columns "alpha" and "beta"
     are the line parameters in the representation
     (cos(alpha),sin(alpha))(x,y)' = beta, where alpha is within
     [0,2pi). For the representation y=mx+b, the return value can be
     passed to 'regparm'.
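
     The relation between the two representations follows from solving
     cos(alpha)*x + sin(alpha)*y = beta for y. The sketch below spells
     this out in base R; 'to_slope_intercept' is a hypothetical helper
     for illustration and is not the package's 'regparm'.

```r
# Hypothetical helper, not the edci implementation of regparm.
to_slope_intercept <- function(alpha, beta) {
  s <- sin(alpha)
  # sin(alpha) == 0 means a vertical line, which has no y = m*x + b form
  vertical <- abs(s) < 1e-12
  slope <- ifelse(vertical, NA_real_, -cos(alpha) / s)
  intercept <- ifelse(vertical, NA_real_, beta / s)
  cbind(intercept = intercept, slope = slope)
}

# The line y = 2x + 1 has normal direction (-2, 1) and beta = 1/sqrt(5)
to_slope_intercept(alpha = atan2(1, -2), beta = 1 / sqrt(5))
```

     Vertical lines (sin(alpha) = 0) cannot be written as y=mx+b, which
     is one reason to work with the (alpha, beta) form.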

     The columns "value" and "count" give the value of the objective
     function and the number of times each line was found.

_A_u_t_h_o_r(_s):

     Tim Garlipp, garlipp@mathematik.uni-oldenburg.de

_R_e_f_e_r_e_n_c_e_s:

     Müller, C.H., Garlipp, T. (2003) Simple consistent cluster methods
     based on redescending M-estimators with an application to edge
     identification in images, to appear in JMVA.

_S_e_e _A_l_s_o:

     'bestMclust', 'projMclust', 'envMclust', 'deldupMclust'

_E_x_a_m_p_l_e_s:

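       # Two noisy linear clusters of 100 observations each
       x <- c(rnorm(100, 0, 3), rnorm(100, 5, 3))
       y <- c(-2 * x[1:100] - 5, 0.5 * x[101:200] + 30) / 2
       x <- x + rnorm(200, 0, 0.5)
       y <- y + rnorm(200, 0, 0.5)

       # Search for center lines with random starting values (bw = 1)
       reg <- oregMclust(x, y, 1, method = "prob")
       reg <- projMclust(reg, x, y)
       reg
       # Plot the two lines selected by bestMclust with crit = "proj"
       plot(bestMclust(reg, 2, crit = "proj"), x, y)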

