classify               package:TRAMPR               R Documentation

_V_a_l_u_e _M_a_t_c_h_i_n_g _f_o_r _D_a_t_a _F_r_a_m_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     'match'-like classification for data.frames; returns a vector of
     row numbers of (first) matches of its first argument in its
     second, across shared column names.  This is unlikely to be useful
     to casual 'TRAMP' users, but see the final example for a relevant
     usage.

_U_s_a_g_e:

     classify(x, table, ...)

_A_r_g_u_m_e_n_t_s:

       x: data.frame: containing columns with the values to be matched.

   table: data.frame: where all columns contain the values to be
          matched against.

     ...: Additional arguments to 'match' (see especially 'nomatch').
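     For reference, this is how 'nomatch' behaves in base 'match'. The
     text above says extra arguments are passed on to 'match', so
     'classify' is assumed to behave the same way; the snippet only
     exercises base R and does not call 'classify' itself. The names
     'keys', 'res_default' and 'res_zero' are illustrative.

```r
# Base R only: 'nomatch' controls what is returned for unmatched values.
keys <- c("a", "b")

res_default <- match(c("b", "z"), keys)                # 2 NA
res_zero    <- match(c("b", "z"), keys, nomatch = 0L)  # 2 0

print(res_default)
print(res_zero)
```

     A 'nomatch' of 0 is convenient when the result is used directly as
     a row index, because indexing with 0 selects nothing.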

_D_e_t_a_i_l_s:

     As with 'duplicated.data.frame', this works by pasting together a
     character representation of the rows separated by '\r' (a carriage
     return), so may be imperfect if the data.frame has characters with
     embedded carriage returns or columns which do not reliably map to
     characters.
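     The pasting approach described above can be sketched in base R as
     follows. This is a hypothetical re-implementation for illustration
     ('classify_sketch' is not a TRAMPR function, and the package's
     actual source may differ); it deliberately omits the 'NA' handling
     that 'classify' provides.

```r
# Hypothetical sketch, NOT the TRAMPR source: build one string key per row
# from the columns of 'table', then hand the keys to base 'match'.
classify_sketch <- function(x, table, ...) {
  cols <- names(table)
  # Every column of 'table' must exist in 'x'; extra columns in 'x' are ignored.
  stopifnot(all(cols %in% names(x)))
  # Paste the shared columns row-wise, separated by a carriage return.
  key <- function(d) do.call(paste, c(unname(as.list(d[cols])), sep = "\r"))
  match(key(x), key(table), ...)
}

tbl <- data.frame(a = c("a", "b", "c"), b = c(1L, 2L, 1L),
                  stringsAsFactors = FALSE)
# Column order differs from 'tbl' and there is an extra column.
x <- data.frame(b = c(2L, 1L, 9L), a = c("b", "c", "a"),
                extra = c(0.1, 0.2, 0.3), stringsAsFactors = FALSE)

res <- classify_sketch(x, tbl)
print(res)  # 2 3 NA
```

     Because the keys are plain strings, a value that itself contains a
     carriage return could make two different rows produce the same key,
     which is the imperfection noted above.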

     Cases in 'x' with 'NA' values in any column shared with 'table'
     will not be matched (and will return the value of 'nomatch'). 
     Cases in 'table' with 'NA' values in any column will match
     nothing.
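     One way to obtain the 'NA' behaviour described above on top of
     string keys is to blank out the key of any incomplete row and tell
     'match' to treat 'NA' as incomparable. This is a sketch under that
     assumption; 'classify_na_sketch' is a hypothetical name and the
     package may implement the rule differently.

```r
# Hypothetical sketch, NOT the TRAMPR source: rows with NA in any column
# named in 'table' never match, on either side.
classify_na_sketch <- function(x, table, nomatch = NA_integer_) {
  cols <- names(table)
  key <- function(d) {
    k <- do.call(paste, c(unname(as.list(d[cols])), sep = "\r"))
    # Replace the key of every incomplete row by NA.
    k[!stats::complete.cases(d[cols])] <- NA_character_
    k
  }
  # 'incomparables' makes NA keys in 'x' return 'nomatch' instead of
  # matching NA keys in 'table'.
  match(key(x), key(table), nomatch = nomatch,
        incomparables = NA_character_)
}

tbl <- data.frame(a = c("a", NA, "b"), b = c(1L, 2L, 2L),
                  stringsAsFactors = FALSE)
x <- data.frame(a = c("a", NA, "b", "b"), b = c(1L, 2L, 2L, NA),
                stringsAsFactors = FALSE)

res_na <- classify_na_sketch(x, tbl)
print(res_na)  # 1 NA 3 NA
```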

     All columns in 'table' must be present in 'x', but 'x' may have
     additional columns that will be ignored.

_V_a_l_u_e:

     A vector of length 'nrow(x)', with each element giving the row
     number in 'table' where all elements match across shared columns.
     Elements with no match take the value of 'nomatch'.

_S_e_e _A_l_s_o:

     'match', on which this is based.

_E_x_a_m_p_l_e_s:

     table <- data.frame(a=letters[1:3], b=rep(1:2, each=3))
     x <- cbind(table[sample(nrow(table), 20, TRUE),], x=runif(20))

     classify(x, table)
     all.equal(table[classify(x, table),], x[names(table)])

     ## Select only a few cases from a TRAMPsamples data object,
     ## corresponding with 4 enzyme/primer combinations.
     data(demo.samples)
     d <- demo.samples$data
     use <- expand.grid(primer=c("ITS1F", "ITS4"),
                        enzyme=c("HpyCH4IV", "BsuRI"))
     classify(d, use)
     d[!is.na(classify(d, use)),]

