plot2DProjection      package:clusterGeneration      R Documentation

_P_L_O_T _A _P_A_I_R _O_F _C_L_U_S_T_E_R_S _A_L_O_N_G _A _2-_D _P_R_O_J_E_C_T_I_O_N _S_P_A_C_E

_D_e_s_c_r_i_p_t_i_o_n:

     Plot a pair of clusters along a 2-D projection space.

_U_s_a_g_e:

     plot2DProjection(y1, y2, projDir, 
       sepValMethod=c("normal", "quantile"), 
       iniProjDirMethod=c("SL", "naive"), 
       projDirMethod=c("newton", "fixedpoint"), 
       xlim=NULL, ylim=NULL, 
       xlab="1st projection direction", 
       ylab="2nd projection direction", 
       title="Scatter plot of 2-D Projected Clusters",
       font=2, font.lab=2, cex=1.2, cex.lab=1, cex.main=1.5,
       lwd=4, lty1=1, lty2=2, pch1=18, pch2=19, col1=2, col2=4, 
       alpha=0.05, ITMAX=20, eps=1.0e-10, quiet=TRUE)

_A_r_g_u_m_e_n_t_s:

      y1: Data matrix of cluster 1. Rows correspond to observations.
          Columns correspond to variables. 

      y2: Data matrix of cluster 2. Rows correspond to observations.
          Columns correspond to variables. 

 projDir: 1-D projection direction along which two clusters will be
          projected. 

sepValMethod: Method to calculate the separation index for a pair of
          clusters projected onto a 1-D space.
          'sepValMethod="quantile"' indicates that the quantile version
          of the separation index will be used:
          $sepVal=(L_2-U_1)/(U_2-L_1)$, where $L_i$ and $U_i$,
          $i=1, 2$, are the lower and upper 'alpha/2' sample
          percentiles of projected cluster $i$.
          'sepValMethod="normal"' indicates that the normal version of
          the separation index will be used:
          $sepVal=[(\bar{x}_2-\bar{x}_1)-z_{alpha/2}(s_1+s_2)]/
          [(\bar{x}_2-\bar{x}_1)+z_{alpha/2}(s_1+s_2)]$, where
          $\bar{x}_i$ and $s_i$ are the sample mean and standard
          deviation of projected cluster $i$, and $z_{alpha/2}$ is the
          upper 'alpha/2' percentile of the standard normal
          distribution. 
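
          The two versions of the separation index can be sketched in
          base R as follows. This is a minimal illustration, not the
          package's implementation: the helper name 'sepIndex1D' is
          hypothetical, and the sketch assumes the clusters are
          relabelled so that cluster 2 has the larger projected mean.

```r
# Hypothetical helper (not part of clusterGeneration): separation index
# of two clusters x1 and x2 that have already been projected onto 1-D.
sepIndex1D <- function(x1, x2, method = c("normal", "quantile"),
                       alpha = 0.05, eps = 1.0e-10) {
  method <- match.arg(method)
  # Assumption: relabel so that cluster 2 lies to the right of cluster 1.
  if (mean(x1) > mean(x2)) {
    tmp <- x1; x1 <- x2; x2 <- tmp
  }
  if (method == "quantile") {
    # L_i, U_i: lower and upper alpha/2 sample percentiles
    L1 <- quantile(x1, alpha / 2); U1 <- quantile(x1, 1 - alpha / 2)
    L2 <- quantile(x2, alpha / 2); U2 <- quantile(x2, 1 - alpha / 2)
  } else {
    # L_i, U_i: mean -/+ z_{alpha/2} * standard deviation
    z <- qnorm(1 - alpha / 2)
    L1 <- mean(x1) - z * sd(x1); U1 <- mean(x1) + z * sd(x1)
    L2 <- mean(x2) - z * sd(x2); U2 <- mean(x2) + z * sd(x2)
  }
  denom <- U2 - L1
  # A (numerically) zero denominator is treated as total overlap.
  if (abs(denom) < eps) return(-1)
  unname((L2 - U1) / denom)
}

set.seed(1234)
x1 <- rnorm(50, mean = 0, sd = 1)
x2 <- rnorm(100, mean = 10, sd = 1)
sNormal   <- sepIndex1D(x1, x2, method = "normal")
sQuantile <- sepIndex1D(x1, x2, method = "quantile")
sOverlap  <- sepIndex1D(rep(1, 5), rep(1, 5))
```

          Well-separated clusters give a positive index below 1, while
          two identical degenerate clusters hit the 'eps' check and
          return -1.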

iniProjDirMethod: Method used to obtain the initial projection
          direction when calculating the separation index between a
          pair of clusters (cf. Qiu and Joe, 2006a, 2006b). 
          'iniProjDirMethod="SL"' indicates that the initial projection
          direction is the sample version of the SL projection
          direction (Su and Liu, 1993),
          $(\boldsymbol{\Sigma}_1+\boldsymbol{\Sigma}_2)^{-1}(\boldsymbol{\mu}_2-\boldsymbol{\mu}_1)$.
          'iniProjDirMethod="naive"' indicates that the initial
          projection direction is
          $\boldsymbol{\mu}_2-\boldsymbol{\mu}_1$. 
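
          As a rough illustration, the sample versions of the two
          initial directions can be computed in base R as below. The
          helper name 'iniProjDir' is hypothetical, and scaling the
          result to unit length is an assumption made here for
          convenience, not something this page specifies.

```r
# Hypothetical helper (not part of clusterGeneration): sample version of
# the initial projection direction for two data matrices y1 and y2.
iniProjDir <- function(y1, y2, method = c("SL", "naive")) {
  method <- match.arg(method)
  # naive direction: difference of the sample mean vectors
  d <- colMeans(y2) - colMeans(y1)
  if (method == "SL") {
    # SL direction: (S1 + S2)^{-1} (mean2 - mean1), with S_i the sample
    # covariance matrices; requires S1 + S2 to be invertible
    d <- solve(cov(y1) + cov(y2), d)
  }
  # Assumption: return a unit-length vector.
  d / sqrt(sum(d^2))
}

set.seed(1234)
y1 <- matrix(rnorm(100), 50, 2)
y2 <- matrix(rnorm(200), 100, 2) + rep(c(10, 0), each = 100)
dSL    <- iniProjDir(y1, y2, method = "SL")
dNaive <- iniProjDir(y1, y2, method = "naive")
```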

projDirMethod: Method used to obtain the optimal projection
          direction when calculating the separation index between a
          pair of clusters (cf. Qiu and Joe, 2006a, 2006b). 
          'projDirMethod="newton"' indicates that the Newton-Raphson
          method is used to search for the optimal projection
          direction (cf. Qiu and Joe, 2006a). This requires the
          assumption that both covariance matrices of the pair of
          clusters are positive-definite. If this assumption is
          violated, the "fixedpoint" method can be used. The
          "fixedpoint" method iteratively searches for the optimal
          projection direction based on the first derivative of the
          separation index with respect to the projection direction
          (cf. Qiu and Joe, 2006b). 

    xlim: Range of X axis. 

    ylim: Range of Y axis. 

    xlab: X axis label. 

    ylab: Y axis label. 

   title: Title of the plot. 

    font: An integer which specifies which font to use for text (see
          'par'). 

font.lab: The font to be used for x and y labels (see 'par'). 

     cex: A numerical value giving the amount by which plotting text
          and symbols should be scaled relative to the default (see
          'par'). 

 cex.lab: The magnification to be used for x and y labels relative to
          the current setting of 'cex' (see 'par'). 

cex.main: The magnification to be used for main titles relative to the
          current setting of 'cex' (see 'par'). 

     lwd: The line width, a _positive_ number (see 'par'). The default
          in this function is '4'. 

    lty1: Line type for cluster 1 (see 'par'). 

    lty2: Line type for cluster 2 (see 'par'). 

    pch1: Either an integer specifying a symbol or a single character
          to be used as the default in plotting points for cluster 1
          (see 'points'). 

    pch2: Either an integer specifying a symbol or a single character
          to be used as the default in plotting points for cluster 2
          (see 'points'). 

    col1: Color used to indicate cluster 1. 

    col2: Color used to indicate cluster 2. 

   alpha: Tuning parameter reflecting the percentage in the two tails
          of a projected cluster that might be outlying. 

   ITMAX: Maximum number of iterations allowed when iteratively
          calculating the optimal projection direction. The actual
          number of iterations is usually far fewer than the default
          value of 20. 

     eps: A small positive number used to check whether a quantity q is
          equal to zero. If |q|<'eps', then q is regarded as equal to
          zero. 'eps' is used to check whether the denominator in the
          formula of the separation index is equal to zero. A zero
          denominator indicates that the two clusters overlap
          completely, in which case the separation index is set to
          $-1$. The default value of 'eps' is 1.0e-10. 

   quiet: A flag to switch on/off the outputs of intermediate results
          and/or possible warning messages. The default value is
          'TRUE'. 

_D_e_t_a_i_l_s:

     To get the second projection direction, we first construct an
     orthogonal  matrix with first column 'projDir'. Then we rotate the
     data points  according to this orthogonal matrix. Next, we remove
     the first dimension  of the rotated data points, and obtain the
     optimal projection direction  $projDir2$ for the rotated data
     points in the remaining dimensions.  Finally, we rotate the vector
     $projDir3=(0, projDir2)$ back to the original space.  The vector
     $projDir3$ is the second projection direction.
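
     The construction above can be sketched in base R. This is only an
     illustration under stated assumptions: the helper name
     'secondProjDir' is hypothetical, the orthogonal matrix is built
     from a QR decomposition (one of several possible choices), and the
     SL direction is used as a simple stand-in for the optimal
     projection direction in the remaining dimensions.

```r
# Hypothetical helper (not part of clusterGeneration): a second
# projection direction that is orthogonal to projDir.
secondProjDir <- function(y1, y2, projDir) {
  p <- length(projDir)
  a <- projDir / sqrt(sum(projDir^2))
  # Orthogonal matrix whose first column is a (QR of [a, I] gives +/- a).
  Q <- qr.Q(qr(cbind(a, diag(p))))
  if (sum(Q[, 1] * a) < 0) Q[, 1] <- -Q[, 1]
  # Rotate the data, then drop the first rotated dimension.
  r1 <- (y1 %*% Q)[, -1, drop = FALSE]
  r2 <- (y2 %*% Q)[, -1, drop = FALSE]
  # Stand-in for the optimal direction in the remaining dimensions:
  # the SL direction (S1 + S2)^{-1} (mean2 - mean1). Assumes S1 + S2 is
  # invertible and the rotated cluster means differ.
  projDir2 <- solve(cov(r1) + cov(r2), colMeans(r2) - colMeans(r1))
  projDir2 <- projDir2 / sqrt(sum(projDir2^2))
  # Rotate (0, projDir2) back to the original space.
  as.vector(Q %*% c(0, projDir2))
}

set.seed(1234)
y1 <- matrix(rnorm(150), 50, 3)
y2 <- matrix(rnorm(300), 100, 3) + rep(c(10, 2, 0), each = 100)
projDir  <- c(1, 0, 0)
projDir3 <- secondProjDir(y1, y2, projDir)
```

     Because the first coordinate of the rotated vector is zero, the
     resulting direction is orthogonal to 'projDir' by construction.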

     The ticks along the X axis indicate the positions of the points of
     the two projected clusters. The positions of $L_i$ and $U_i$,
     $i=1, 2$, are also indicated on the X axis, where $L_i$ and $U_i$
     are the lower and upper $alpha/2$ sample percentiles of cluster
     $i$ if 'sepValMethod="quantile"'. If 'sepValMethod="normal"',
     $L_i=\bar{x}_i-z_{alpha/2}s_i$ and
     $U_i=\bar{x}_i+z_{alpha/2}s_i$, where $\bar{x}_i$ and $s_i$ are
     the sample mean and standard deviation of cluster $i$, and
     $z_{alpha/2}$ is the upper $alpha/2$ percentile of the standard
     normal distribution.

_V_a_l_u_e:

 sepValx: value of the separation index for the projected two clusters
          along the 1st projection direction. 

 sepValy: value of the separation index for the projected two clusters
          along the 2nd projection direction. 

      Q2: 1st column is the 1st projection direction. 2nd column is the
          2nd projection direction. 

_A_u_t_h_o_r(_s):

     Weiliang Qiu stwxq@channing.harvard.edu
      Harry Joe harry@stat.ubc.ca

_R_e_f_e_r_e_n_c_e_s:

     Qiu, W.-L. and Joe, H. (2006a) Generation of Random Clusters with
     Specified Degree of Separation. _Journal of Classification_,
     *23*(2), 315-334.

     Qiu, W.-L. and Joe, H. (2006b) Separation Index and Partial
     Membership for Clustering. _Computational Statistics and Data
     Analysis_, *50*, 585-603.

_S_e_e _A_l_s_o:

     'plot1DProjection' 'viewClusters'

_E_x_a_m_p_l_e_s:

     library(clusterGeneration)
     library(MASS)  # for mvrnorm()

     # parameters of the two bivariate normal clusters
     n1<-50
     mu1<-c(0,0)
     Sigma1<-matrix(c(2,1,1,5),2,2)
     n2<-100
     mu2<-c(10,0)
     Sigma2<-matrix(c(5,-1,-1,2),2,2)

     # simulate the two clusters and stack them with cluster labels
     set.seed(1234)
     y1<-mvrnorm(n1, mu1, Sigma1)
     y2<-mvrnorm(n2, mu2, Sigma2)
     y<-rbind(y1, y2)
     cl<-rep(1:2, c(n1, n2))

     b<-getSepProjData(y, cl, iniProjDirMethod="SL", projDirMethod="newton")
     # projection direction for clusters 1 and 2
     projDir<-b$projDirArray[1,2,]

     # 1-D projection on top, 2-D projection below
     par(mfrow=c(2,1))
     plot1DProjection(y1, y2, projDir)
     plot2DProjection(y1, y2, projDir)

