TRAMPknowns              package:TRAMPR              R Documentation

_T_R_A_M_P_k_n_o_w_n_s _O_b_j_e_c_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     These functions create and interact with 'TRAMPknowns' objects
     (collections of known TRFLP patterns).  Knowns contrast with
     "samples" (see 'TRAMPsamples') in that knowns contain identified
     profiles, while samples contain unidentified profiles.  Knowns must
     have at most one peak per enzyme/primer combination (see Details).

_U_s_a_g_e:

     TRAMPknowns(data, info, cluster.pars=list(), file.pat=NULL,
                 warn.factors=TRUE, ...)

     ## S3 method for class 'TRAMPknowns':
     labels(object, ...)
     ## S3 method for class 'TRAMPknowns':
     summary(object, include.info=FALSE, ...)

_A_r_g_u_m_e_n_t_s:

    data: data.frame containing peak information.

    info: data.frame, describing individual samples (see Details for
          definitions of both data.frames).

cluster.pars: Parameters used when clustering the knowns database.  See
          Details.

file.pat: Optional partial filename in which to store knowns database
          after modification.  Files '<file.pat>_info.csv' and
          '<file.pat>_data.csv' will be created.

warn.factors: Logical: Should a warning be given if any columns in
          'info' or 'data' are converted into factors?

  object: A 'TRAMPknowns' object.

include.info: Logical: Should the output be augmented with the contents
          of the 'info' component of the 'TRAMPknowns' object?

     ...: 'TRAMPknowns': Additional objects to incorporate into a
          'TRAMPknowns' object.  Other methods: Further arguments
          passed to or from other methods.

_D_e_t_a_i_l_s:

     The object has at least two components, which relate to each other
     (in the sense of a relational database).  'info' holds information
     about the individual samples, and 'data' holds information about
     individual peaks (many of which may belong to a single sample).

     Column definitions:

        *  'info':

        '_k_n_o_w_n_s._p_k': Unique positive integer, used to identify
             individual knowns (i.e. a "primary key").

        '_s_p_e_c_i_e_s': Character, giving species name.

        *  'data':

        '_k_n_o_w_n_s._f_k': Positive integer, indicating which sample the peak
             belongs to (by matching against 'info$knowns.pk') (i.e. a
             "foreign key").

        '_p_r_i_m_e_r': Character, giving the name of the primer used.

        '_e_n_z_y_m_e': Character, giving the name of the restriction digest
             enzyme used.

        '_s_i_z_e': Numeric, giving size (in base pairs) of the peak.

     In addition, 'TRAMPknowns' will create additional columns holding
     clustering information (see 'group.knowns').  Additional columns
     are allowed (and retained, but ignored) in both data.frames.
     Additional objects are allowed as part of the 'TRAMPknowns'
     object, but these will not be written by 'write.TRAMPknowns'; any
     extra objects passed (via '...') will be included in the final
     'TRAMPknowns' object.

     The 'cluster.pars' argument controls how knowns will be clustered
     (this will happen automatically as needed).  Elements of the list
     'cluster.pars' may be any of the three arguments to
     'group.knowns', and will be used as defaults in subsequent calls
     to 'group.knowns'.  If not provided, default values are:
     'dist.method="maximum"', 'hclust.method="complete"',
     'cut.height=2.5' (if only some elements of 'cluster.pars' are
     provided, the remaining elements default to the values above).  To
     change values of clustering parameters in an existing
     'TRAMPknowns' object, use 'group.knowns'.
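
     The way a partial 'cluster.pars' list falls back on the defaults
     can be sketched in base R as below.  This is an illustration
     only: it assumes that the three parameters correspond to the
     'method' argument of 'dist', the 'method' argument of 'hclust'
     and the 'h' argument of 'cutree', which this page does not state;
     see 'group.knowns' for the actual behaviour.  The 'sizes' matrix
     is hypothetical.

```r
## Defaults as listed above; a partial list overrides only the
## elements it names.
defaults <- list(dist.method = "maximum",
                 hclust.method = "complete",
                 cut.height = 2.5)
pars <- modifyList(defaults, list(cut.height = 4))

## Hypothetical peak sizes (base pairs): one row per known, one
## column per enzyme/primer combination.
sizes <- rbind(k1 = c(100.0, 250.0),
               k2 = c(101.5, 251.0),
               k3 = c(400.0, 600.0))

## ASSUMPTION: the parameters map onto dist(), hclust() and cutree().
d <- dist(sizes, method = pars$dist.method)
tree <- hclust(d, method = pars$hclust.method)
groups <- cutree(tree, h = pars$cut.height)
groups
```

     Under these assumptions 'k1' and 'k2', whose peaks differ by at
     most 1.5 base pairs, fall into the same group, while 'k3' forms a
     group of its own.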

     A known contains at most one peak per enzyme/primer combination.
     Where a species is known to have multiple TRFLP profiles, these
     should be treated as separate knowns with different, unique,
     'knowns.pk' values, but with identical 'species' values.  A sample
     containing either pattern will then be recorded as having that
     species present (see 'group.knowns').
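
     One possible way to check the "at most one peak per enzyme/primer
     combination" rule before building the object is sketched below in
     base R.  The data and the object names are hypothetical, and this
     is not necessarily the check that 'TRAMPknowns' itself performs.

```r
## Hypothetical peak data: known 2 has two peaks for ITS1F/BsuRI.
peak.data <- data.frame(knowns.fk = c(1, 1, 2, 2),
                        primer = c("ITS1F", "ITS4", "ITS1F", "ITS1F"),
                        enzyme = c("BsuRI", "BsuRI", "BsuRI", "BsuRI"),
                        size = c(120.5, 342.0, 98.2, 101.7),
                        stringsAsFactors = FALSE)

## Flag every row whose known/primer/enzyme combination occurs more
## than once.
key.cols <- c("knowns.fk", "primer", "enzyme")
dup <- duplicated(peak.data[key.cols]) |
  duplicated(peak.data[key.cols], fromLast = TRUE)
peak.data[dup, ]

## Treat the second profile as a separate known (a matching row with
## the same 'species' would also be needed in 'info').
peak.data$knowns.fk[4] <- 3
any(duplicated(peak.data[key.cols]))
```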

_V_a_l_u_e:

TRAMPknowns: A new 'TRAMPknowns' object: a list with components 'info',
          'data' (the provided data.frames, with clustering information
          added to 'info'), 'cluster.pars' and 'file.pat', plus any
          extra objects passed as '...'.

labels.TRAMPknowns: A sorted vector of the unique knowns present in
          'object' (from 'info$knowns.pk').

summary.TRAMPknowns: A data.frame, with the size of the peak (if
          present) for each enzyme/primer combination, with each known
          (indicated by 'knowns.pk') as rows and each combination (in
          the format '<primer>_<enzyme>') as columns.

_N_o_t_e:

     Across a 'TRAMPknowns' object, primer and enzyme names must be
     _exactly_ the same (including case and whitespace) to be
     considered the same.  For example '"ITS4"', '"Its4"', '"ITS 4"'
     and '"ITS4 "' would be considered to be four different primers.
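
     The exact-match behaviour can be seen with plain character
     vectors.  The cleanup step below (removing whitespace and
     upper-casing) is only one hypothetical way to normalise names
     before constructing the object; it is not something TRAMPR does
     for you, and it may be inappropriate if names legitimately differ
     only by case.

```r
primers <- c("ITS4", "Its4", "ITS 4", "ITS4 ")

## All four strings are distinct, so they count as four primers.
length(unique(primers))

## Hypothetical cleanup: drop all whitespace, then upper-case.
clean <- toupper(gsub("[[:space:]]+", "", primers))
unique(clean)
```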

     Factors will not merge correctly (with 'combine.TRAMPknowns' or
     'add.known'). 'TRAMPknowns' will attempt to catch factor columns
     and convert them into characters for the 'info' and 'data'
     data.frames. Other objects (passed as part of '...') will not be
     altered.
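
     Factor columns often arise unintentionally; 'expand.grid', for
     example, converts character vectors to factors by default.  A
     minimal base R sketch of detecting and converting such columns
     before they reach 'TRAMPknowns' follows (the object names are
     hypothetical).

```r
## expand.grid() turns character inputs into factors by default.
peak.data <- expand.grid(knowns.fk = 1:2,
                         primer = c("ITS1F", "ITS4"),
                         enzyme = c("BsuRI", "HpyCH4IV"))
is.fac <- sapply(peak.data, is.factor)
is.fac

## Convert only the factor columns to character.
peak.data[is.fac] <- lapply(peak.data[is.fac], as.character)
sapply(peak.data, class)
```

     Alternatively, passing 'stringsAsFactors=FALSE' to 'expand.grid'
     or 'data.frame' avoids creating the factors in the first place.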

_S_e_e _A_l_s_o:

     'TRAMPsamples', which constructs an analogous object to hold
     "samples" data.

     'plot.TRAMPknowns', which creates a graphical representation of
     the knowns data.

     'TRAMP', for matching unknown TRFLP patterns to 'TRAMPknowns'
     objects.

     'group.knowns', which groups similar knowns (generally called
     automatically).

     'add.known' and 'combine.TRAMPknowns', which provide tools for
     adding knowns from a sample data set and merging knowns databases.

_E_x_a_m_p_l_e_s:

     ## This example builds a TRAMPknowns object from completely artificial
     ## data:

     ## The info data.frame:
     knowns.info <-
       data.frame(knowns.pk=1:8,
                  species=rep(paste("Species", letters[1:5]), length.out=8))
     knowns.info

     ## The data data.frame:
     knowns.data <- expand.grid(knowns.fk=1:8,
                                primer=c("ITS1F", "ITS4"),
                                enzyme=c("BsuRI", "HpyCH4IV"))
     knowns.data$size <- runif(nrow(knowns.data), min=40, max=800)

     ## Construct the TRAMPknowns object:
     demo.knowns <- TRAMPknowns(knowns.data, knowns.info, warn.factors=FALSE)

     ## A plot of the pretend knowns:
     plot(demo.knowns, cex=1, group.clusters=TRUE)

