xtabs                 package:Matrix                 R Documentation

_C_r_o_s_s _T_a_b_u_l_a_t_i_o_n, _O_p_t_i_o_n_a_l_l_y _S_p_a_r_s_e

_D_e_s_c_r_i_p_t_i_o_n:

     Create a contingency table from cross-classifying factors, usually
     contained in a data frame, using a formula interface.

     This is a fully compatible extension of the standard 'stats'
     package 'xtabs()' function with the added option to produce a
     _sparse_ matrix result via 'sparse = TRUE'.

_U_s_a_g_e:

     xtabs(formula = ~., data = parent.frame(), subset, sparse = FALSE, na.action,
           exclude = c(NA, NaN), drop.unused.levels = FALSE)

_A_r_g_u_m_e_n_t_s:

 formula: a formula object with the cross-classifying variables
          (separated by '+') on the right hand side (or an object which
          can be coerced to a formula).  Interactions are not allowed.
          On the left hand side, one may optionally give a vector or a
          matrix of counts; in the latter case, the columns are
          interpreted as corresponding to the levels of a variable.
          This is useful if the data have already been tabulated; see
          the 'stats' examples referenced in the Examples section.

    data: an optional matrix or data frame (or similar: see
          'model.frame') containing the variables in the formula
          'formula'.  By default the variables are taken from
          'environment(formula)'.

  subset: an optional vector specifying a subset of observations to be
          used.

  sparse: logical indicating whether the result should be a _sparse_
          matrix, i.e., one inheriting from 'sparseMatrix'.  This works
          only for two factors, since there are no higher-order sparse
          array classes yet.

na.action: a function which indicates what should happen when the data
          contain 'NA's.

 exclude: a vector of values to be excluded when forming the set of
          levels of the classifying factors.

drop.unused.levels: a logical indicating whether to drop unused levels
          in the classifying factors.  If this is 'FALSE' and there are
          unused levels, the table will contain zero marginals, and a
          subsequent chi-squared test for independence of the factors
          will not work.
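
     As a minimal sketch of the 'drop.unused.levels' behaviour described
     above, the following uses a small hypothetical data frame 'd' (not
     part of this help page) in which the level "c" of factor 'g' never
     occurs.  It assumes only the documented arguments of 'xtabs'.

```r
# Hypothetical data: factor 'g' declares a level "c" that never occurs.
d <- data.frame(g = factor(c("a", "b", "a", "b"), levels = c("a", "b", "c")),
                h = factor(c("x", "y", "y", "x")))

# Default (drop.unused.levels = FALSE): the unused level is kept,
# giving an all-zero row and hence a zero marginal.
t_keep <- xtabs(~ g + h, data = d)

# Dropping unused levels removes that row from the table.
t_drop <- xtabs(~ g + h, data = d, drop.unused.levels = TRUE)

dim(t_keep)
dim(t_drop)
```

     Comparing the two 'dim' results shows whether the empty level was
     retained.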

_D_e_t_a_i_l_s:

     Non-sparse 'xtabs' results are contingency table objects.  For
     such objects, whether created by 'table' or 'xtabs', there is a
     'summary' method which gives basic information and performs a
     chi-squared test for independence of factors (note that the
     function 'chisq.test' currently handles only 2-d tables).

     If a left hand side is given in 'formula', its entries are simply
     summed over the cells corresponding to the right hand side; this
     also works if the lhs does not give counts.
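
     The summing of a left hand side can be sketched as follows.  The
     data frame 'd' and its columns 'n' (counts) and 'w' (non-count
     weights) are hypothetical names chosen for illustration; 'tapply'
     is used only as an independent cross-check.

```r
# Hypothetical pre-aggregated data: 'n' holds counts, 'w' holds weights.
d <- data.frame(f1 = factor(c("a", "a", "b", "b", "a")),
                f2 = factor(c("x", "y", "x", "y", "x")),
                n  = c(2, 3, 1, 4, 5),
                w  = c(0.5, 1.5, 2.5, 0.25, 1))

# The left hand side is summed within each (f1, f2) cell.
tab_n <- xtabs(n ~ f1 + f2, data = d)   # cell (a, x) sums 2 and 5
tab_w <- xtabs(w ~ f1 + f2, data = d)   # cell (a, x) sums 0.5 and 1

# Cross-check the count table against an explicit grouped sum.
chk <- tapply(d$n, list(d$f1, d$f2), sum)
all(as.vector(tab_n) == as.vector(chk))
```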

_V_a_l_u_e:

     By default, when 'sparse=FALSE', a contingency table in array
     representation of S3 class 'c("xtabs", "table")', with a '"call"'
     attribute storing the matched call.

     When 'sparse=TRUE', a sparse numeric matrix, specifically an
     object of S4 class dgTMatrix.
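
     A rough way to inspect the two kinds of result is sketched below,
     assuming that package 'Matrix' is installed and that an 'xtabs'
     accepting 'sparse' is on the search path.  The data frame 'd' is
     hypothetical.  The sketch checks only inheritance from
     'sparseMatrix' rather than assuming a particular concrete class.

```r
library(Matrix)  # sparse matrix classes, needed for sparse = TRUE

# Hypothetical data with one repeated (f1, f2) combination.
d <- data.frame(f1 = factor(c("a", "b", "c", "a")),
                f2 = factor(c("x", "y", "z", "x")))

dense  <- xtabs(~ f1 + f2, data = d)
sparse <- xtabs(~ f1 + f2, data = d, sparse = TRUE)

class(dense)                  # S3 classes of the dense table
is(sparse, "sparseMatrix")    # does the sparse result inherit from sparseMatrix?

# Both representations should hold the same counts.
all(as.matrix(sparse) == unclass(dense))
```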

_S_e_e _A_l_s_o:

     The 'xtabs' function in the 'stats' package, and its references.

_E_x_a_m_p_l_e_s:

     ## See for non-sparse examples:
     example(xtabs, package = "stats")

     ## similar to 'ergoStool' from package "nlme":
     d.ergo <- data.frame(Type = paste("T", rep(1:4, 9*4), sep=""),
                          Subj = gl(9,4, 36*4))
     xtabs(~ Type + Subj, data=d.ergo) # 4 replicates each
     set.seed(15) # a subset of cases:
     xtabs(~ Type + Subj, data=d.ergo[sample(36, 10),], sparse=TRUE)

     ## Hypothetical two level setup:
     inner <- factor(sample(letters[1:25], 100, replace = TRUE))
     inout <- factor(sample(LETTERS[1:5], 25, replace = TRUE))
     fr <- data.frame(inner = inner, outer = inout[as.integer(inner)])
     xtabs(~ inner + outer, fr, sparse = TRUE)

