predict                package:arules                R Documentation

_M_o_d_e_l _P_r_e_d_i_c_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     Provides the S4 method 'predict' for 'itemMatrix' (e.g.,
     transactions).  Predicts the membership (nearest neighbor) of new
     data to clusters represented by medoids or labeled examples.
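
     The nearest-neighbor idea described above can be sketched in base
     R on plain 0/1 matrices. This is an illustration only, not the
     arules implementation: the helper names 'jaccard_cross' and
     'nn_predict' are hypothetical, and treating two empty rows as
     identical is an assumption made for this sketch.

```r
# Jaccard dissimilarity between every row of `a` and every row of `b`.
# Both are 0/1 matrices with the same columns (items).
jaccard_cross <- function(a, b) {
  common <- a %*% t(b)                                   # |A intersect B|
  total  <- outer(rowSums(a), rowSums(b), "+") - common  # |A union B|
  # Assumption for this sketch: two empty rows count as identical.
  1 - ifelse(total == 0, 1, common / total)
}

# Assign each row of `newdata` the label of its nearest row in `examples`.
# Without labels, the row index of the nearest example (medoid) is returned.
nn_predict <- function(examples, newdata, labels = NULL) {
  d <- jaccard_cross(newdata, examples)  # rows: new objects, cols: examples
  nearest <- apply(d, 1, which.min)
  if (is.null(labels)) nearest else labels[nearest]
}

examples <- rbind(c(1, 1, 0, 0),
                  c(0, 0, 1, 1))
newdata  <- rbind(c(1, 1, 1, 0),
                  c(0, 1, 1, 1),
                  c(1, 0, 0, 0))

pred_index  <- nn_predict(examples, newdata)
pred_labels <- nn_predict(examples, newdata, labels = c(10L, 20L))
```

     Here the first and third new rows are closest to the first
     example and the second new row is closest to the second example.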

_U_s_a_g_e:

     ## S4 method for signature 'itemMatrix':
     predict(object, newdata, labels = NULL, blocksize = 200, ...)

_A_r_g_u_m_e_n_t_s:

  object: an 'itemMatrix' containing either medoids (no labels needed)
          or labeled examples ('labels' required). 

 newdata: objects to predict labels for. 

  labels: an integer vector containing the labels for the examples in
          'object'. 

blocksize: a numeric scalar indicating approximately how much memory
          (in MB) 'predict' may use when 'object' and/or 'newdata'
          are large. The figure is only a rough estimate: it assumes
          a 32-bit machine (64-bit architectures need about twice
          the memory for the same 'blocksize') and the default
          Jaccard method for the dissimilarity calculation.  In
          general, reducing 'blocksize' decreases memory usage but
          increases run time.

     ...: further arguments passed on to 'dissimilarity', e.g.,
          'method'.
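
     The memory/run-time trade-off behind 'blocksize' can be
     illustrated with a small base R sketch. It assumes that each
     block holds a dense matrix of doubles (8 bytes per entry) with
     one row per new object and one column per example; that sizing
     rule and the helper names 'block_rows' and 'chunk_indices' are
     assumptions for illustration, not a description of the arules
     internals.

```r
# Number of new objects that fit in one block of `blocksize_mb` megabytes
# when each is compared with `n_examples` examples (8 bytes per entry).
block_rows <- function(blocksize_mb, n_examples) {
  max(1L, as.integer(floor(blocksize_mb * 1024^2 / (8 * n_examples))))
}

# Split the indices 1..n into consecutive chunks of at most `rows` elements.
chunk_indices <- function(n, rows) {
  split(seq_len(n), ceiling(seq_len(n) / rows))
}

rows_default <- block_rows(200, 500)   # default blocksize, 500 examples
rows_small   <- block_rows(1, 500)     # a much smaller memory budget

chunks <- chunk_indices(10, 4)         # three chunks: 4, 4 and 2 indices
```

     A smaller budget yields fewer rows per block and therefore more
     blocks to process, which is why lowering 'blocksize' trades
     memory for run time.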

_V_a_l_u_e:

     An integer vector of the same length as 'newdata' containing the
     predicted label for each element.

_S_e_e _A_l_s_o:

     'dissimilarity', 'itemMatrix-class'

_E_x_a_m_p_l_e_s:

     data("Adult")

     ## sample
     small <- sample(Adult, 500)
     large <- sample(Adult, 5000)

     ## cluster a small sample
     d_jaccard <- dissimilarity(small)
     hc <- hclust(d_jaccard)
     l <- cutree(hc, k = 4)

     ## predict labels for a larger sample
     labels <- predict(small, large, l)

     ## plot the profile of the first cluster
     itemFrequencyPlot(large[labels==1, itemFrequency(large) > 0.1])
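
     The medoid case, where no labels are needed, can be sketched as
     follows. The way medoids are picked here (the member of each
     cluster with the smallest total dissimilarity to the other
     members) is an assumption made for this example, not something
     'predict' does itself, and the seed is arbitrary.

```r
library(arules)
data("Adult")

set.seed(1)                        # arbitrary seed, for reproducibility only
small <- sample(Adult, 500)
large <- sample(Adult, 5000)

## cluster the small sample as in the example above
d_jaccard <- dissimilarity(small)
l <- cutree(hclust(d_jaccard), k = 4)

## pick one medoid per cluster: the member with the smallest
## summed dissimilarity to the other members of its cluster
m <- as.matrix(d_jaccard)
medoid_idx <- sapply(split(seq_along(l), l), function(i) {
  i[which.min(rowSums(m[i, i, drop = FALSE]))]
})
medoids <- small[unname(medoid_idx)]

## predict with medoids only; no 'labels' argument is supplied
labels_m <- predict(medoids, large)
```

     The predicted values index the medoids in the order in which
     they appear in 'medoids', which here follows the cluster numbers
     returned by 'cutree'.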

