Weka_clusterers            package:RWeka            R Documentation

_R/_W_e_k_a _C_l_u_s_t_e_r_e_r_s

_D_e_s_c_r_i_p_t_i_o_n:

     R interfaces to Weka clustering algorithms.

_U_s_a_g_e:

     Cobweb(x, control = NULL)
     FarthestFirst(x, control = NULL)
     SimpleKMeans(x, control = NULL)
     XMeans(x, control = NULL)
     DBScan(x, control = NULL)

_A_r_g_u_m_e_n_t_s:

       x: an R object with the data to be clustered.

 control: an object of class 'Weka_control', or a character vector of
          control options, or 'NULL' (default). Available options can
          be listed interactively with the Weka Option Wizard 'WOW', or
          looked up in the Weka documentation.
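
     As a minimal sketch of the two ways to pass options, assuming the
     RWeka package and a working Java installation: the '-N' option
     (number of clusters) is the one used in the examples on this page,
     and the names 'cl_a' and 'cl_b' are illustrative only.

```r
library(RWeka)

## Options as a 'Weka_control' object: argument names map to Weka flags.
cl_a <- SimpleKMeans(iris[, -5], control = Weka_control(N = 3))

## The same options as a character vector of Weka command-line flags.
cl_b <- SimpleKMeans(iris[, -5], control = c("-N", "3"))

## List the options available for a clusterer via the Weka Option Wizard.
WOW("SimpleKMeans")
```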

_D_e_t_a_i_l_s:

     There is a 'predict' method for predicting class ids or
     memberships from the fitted clusterers.
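
     The two kinds of prediction might be requested as sketched below.
     This assumes the 'predict' method accepts a 'newdata' argument and
     a 'type' of either "class_ids" or "memberships", and that
     memberships are only computed when 'newdata' is supplied; check
     the help page of the 'predict' method before relying on this.

```r
library(RWeka)

cl <- SimpleKMeans(iris[, -5], Weka_control(N = 3))

## Class ids for the training data (no 'newdata' given).
ids <- predict(cl)

## Class ids for new data, here simply the first ten rows again.
ids_new <- predict(cl, newdata = iris[1:10, -5], type = "class_ids")

## Memberships, one row per instance; 'newdata' is passed explicitly.
mem <- predict(cl, newdata = iris[, -5], type = "memberships")
dim(mem)
```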

     'Cobweb' implements the Cobweb (Fisher, 1987) and Classit (Gennari
     et al., 1989) clustering algorithms.

     'FarthestFirst' provides the farthest first traversal algorithm
     by Hochbaum and Shmoys, a fast, simple, approximate clusterer
     modeled after simple k-means.
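
     The idea of farthest first traversal can be illustrated in base R.
     This is a sketch only and not Weka's implementation: the function
     'farthest_first' and its arguments are hypothetical, Euclidean
     distance is assumed, and the first center is fixed to the first
     row rather than chosen at random.

```r
## Farthest-first traversal: pick k centers, each one the point farthest
## from the centers chosen so far, then assign points to the nearest center.
farthest_first <- function(x, k, first = 1L) {
    x <- as.matrix(x)
    centers <- integer(k)
    centers[1L] <- first
    ## Distance from every point to its nearest chosen center so far.
    d_min <- sqrt(rowSums(sweep(x, 2L, x[first, ])^2))
    for (i in seq_len(k)[-1L]) {
        ## The next center is the point farthest from all current centers.
        centers[i] <- which.max(d_min)
        d_new <- sqrt(rowSums(sweep(x, 2L, x[centers[i], ])^2))
        d_min <- pmin(d_min, d_new)
    }
    ## Squared distances to each center; assign to the nearest one.
    d_all <- sapply(centers, function(j)
        rowSums(sweep(x, 2L, x[j, ])^2))
    list(centers = centers,
         class_ids = max.col(-d_all, ties.method = "first"))
}

ff <- farthest_first(iris[, -5], k = 3)
table(ff$class_ids, iris$Species)
```

     Unlike k-means, the centers are never updated after selection, so
     a single pass over the data per center suffices.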

     'SimpleKMeans' provides clustering with the k-means algorithm.

     'XMeans' provides k-means extended by an Improve-Structure part
     and automatically determines the number of clusters.

     'DBScan' provides the density-based clustering algorithm by
     Ester, Kriegel, Sander, and Xu. Note that noise points are
     assigned to 'NA'.

_V_a_l_u_e:

     A list inheriting from class 'Weka_clusterers' with components
     including:

clusterer: a reference (of class 'jobjRef') to a Java object obtained
          by applying the Weka 'buildClusterer' method to the training
          instances using the given control options.

class_ids: a vector of integers indicating the class to which each
          training instance is allocated (the results of calling the
          Weka 'clusterInstance' method for the built clusterer and
          each instance).
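
     The components described above can be inspected directly on a
     fitted object. A minimal sketch, assuming RWeka is installed; the
     name 'cl' is illustrative.

```r
library(RWeka)

cl <- SimpleKMeans(iris[, -5], Weka_control(N = 3))

## The Java reference to the built Weka clusterer.
class(cl$clusterer)

## Class ids of the training instances, tabulated.
table(cl$class_ids)
```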

_R_e_f_e_r_e_n_c_e_s:

     M. Ester, H.-P. Kriegel, J. Sander, and X. Xu (1996). A
     Density-Based Algorithm for Discovering Clusters in Large Spatial
     Databases with Noise. _Proceedings of the Second International
     Conference on Knowledge Discovery and Data Mining (KDD'96)_,
     Portland, OR, 226-231. AAAI Press.

     D. H. Fisher (1987). Knowledge acquisition via incremental
     conceptual clustering. _Machine Learning_, *2*/2, 139-172.

     J. Gennari, P. Langley, and D. H. Fisher (1989). Models of
     incremental concept formation. _Artificial Intelligence_, *40*,
     11-62.

     D. S. Hochbaum and D. B. Shmoys (1985). A best possible heuristic
     for the k-center problem. _Mathematics of Operations Research_,
     *10*(2), 180-184.

     D. Pelleg and A. W. Moore (2000). X-means: Extending K-means with
     Efficient Estimation of the Number of Clusters. In: _Seventeenth
     International Conference on Machine Learning_, 727-734. Morgan
     Kaufmann.

     I. H. Witten and E. Frank (2005). _Data Mining: Practical Machine
     Learning Tools and Techniques_. 2nd Edition, Morgan Kaufmann, San
     Francisco.

_E_x_a_m_p_l_e_s:

     cl1 <- SimpleKMeans(iris[, -5], Weka_control(N = 3))
     cl1
     table(predict(cl1), iris$Species)

     ## Use XMeans with a KDTree.
     cl2 <- XMeans(iris[, -5],
                   c("-L", 3, "-H", 7, "-use-kdtree",
                     "-K", "weka.core.neighboursearch.KDTree -P"))
     cl2
     table(predict(cl2), iris$Species)

