readTable              package:R.utils              R Documentation

_R_e_a_d_s _a _f_i_l_e _i_n _t_a_b_l_e _f_o_r_m_a_t

_D_e_s_c_r_i_p_t_i_o_n:

     Reads a file in table format and creates a data frame from it,
     with cases corresponding to lines and variables to fields in the
     file.

     _WARNING: This method is very much in an alpha stage. Expect it to
     change._

     This method is an extension to the default 'read.table'() function
     in R.  It is possible to specify a column name to column class map
     such that the column classes are automatically assigned from the
     column header in the file.

     In addition, it is possible to read any subset of rows. The method
     is optimized such that only columns and rows that are of interest
     are parsed and read into R's memory.  This minimizes memory usage
     at the same time as it speeds up the reading.
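
     The idea of a column name to column class map can be pictured
     with base R alone. The following is a minimal sketch, not the
     R.utils implementation: the example file, the 'classMap' vector
     and the choice of "character" as fallback class are assumptions
     made for illustration. It reads only the header line, matches the
     map names against the column names as regular expressions, and
     passes the resulting classes to 'read.table()'.

```r
# Conceptual sketch in base R; this is NOT the R.utils implementation.
# Write a small tab-delimited example file (hypothetical data).
pathname <- tempfile(fileext = ".txt")
writeLines(c(
  "id\tname\tscoreA\tscoreB",
  "1\talpha\t0.5\t10",
  "2\tbeta\t1.5\t20",
  "3\tgamma\t2.5\t30"
), con = pathname)

# Hypothetical name-to-class map; the names are regular expressions.
classMap <- c("^id$" = "integer", "^score" = "numeric")
defColClass <- "character"  # class for columns matched by no pattern

# Read only the header line to learn the column names.
header <- scan(pathname, what = character(), sep = "\t",
               nlines = 1, quiet = TRUE)

# Start from the fallback class, then overwrite where a pattern matches.
colClasses <- rep(defColClass, length(header))
for (pattern in names(classMap)) {
  colClasses[grepl(pattern, header)] <- classMap[[pattern]]
}

# Parse the table with the classes derived from the header.
data <- read.table(pathname, header = TRUE, sep = "\t",
                   colClasses = colClasses)
str(data)
```

     Deriving the classes from the header means the map does not have
     to list the columns in file order, and one pattern can cover a
     whole family of columns.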

_U_s_a_g_e:

     ## Default S3 method:
     readTable(file, colClasses=NULL, isPatterns=FALSE, defColClass=NA,
               header=FALSE, skip=0, nrows=-1, rows=NULL, col.names=NULL,
               check.names=FALSE, path=NULL, ..., stripQuotes=TRUE,
               method=c("readLines", "intervals"), verbose=FALSE)

_A_r_g_u_m_e_n_t_s:

    file: A 'connection' or a filename.  If a filename, the path
          specified by 'path' is added to the front of the filename. 
          Unopened files are opened and closed at the end.

colClasses: Either a named or an unnamed 'character' 'vector'. If
          unnamed, it specifies the column classes just as in
          'read.table'(). If it is a named vector, 'names(colClasses)'
          are used to match the column names read (this requires that
          'header=TRUE') and the column classes are set to the
          corresponding values.

isPatterns: If 'TRUE', 'names(colClasses)' are matched to the read
          column names by regular expression matching.

defColClass: The column class assigned to columns whose names are not
          matched by the column class map given by a named
          'colClasses' argument. The default is to read these columns
          in an "as is" way.

  header: If 'TRUE', column names are read from the file.

    skip: The number of lines (commented or non-commented) to skip
          before trying to read the header or alternatively the data
          table.

   nrows: The number of rows to read of the data table. Ignored if
          'rows' is specified.

    rows: A row index 'vector' specifying which rows of the table to
          read, where row one is the row following the header.
          Non-existing rows are ignored.  Note that rows are returned
          in the same order as they are requested and duplicated rows
          are also returned.

col.names: Same as in 'read.table()'.

check.names: Same as in 'read.table()', but default value is 'FALSE'
          here.

    path: If 'file' is a filename, this path is added to it, otherwise
          ignored.

     ...: Arguments passed to 'read.table'() used internally.

stripQuotes: If 'TRUE', quotes are stripped from values before being
          parsed. This argument is only effective when
          'method=="readLines"'.

  method: If '"readLines"', 'readLines()' is used internally to first
          read only the rows of interest, which are then passed to
          'read.table()'. If '"intervals"', contiguous intervals are
          first identified in the rows of interest.  These intervals
          are then read one by one using 'read.table()'. The latter
          method is faster and especially more memory efficient if the
          intervals are not too many, whereas the former is preferred
          if many "scattered" rows are to be read.

 verbose: A 'logical' or a 'Verbose' object.
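
     The '"intervals"' strategy described under 'method', together
     with the ordering rules described under 'rows', can be sketched
     in base R. This is an illustration under stated assumptions, not
     the R.utils implementation: the example file and all variable
     names are hypothetical. The requested rows are collapsed into
     runs of consecutive row numbers, each run is read with one
     'read.table()' call, and the result is then put back in the
     requested order.

```r
# Conceptual sketch in base R; this is NOT the R.utils implementation.
# Example file (hypothetical): a header line followed by ten data rows.
pathname <- tempfile(fileext = ".txt")
writeLines(c("x\ty", sprintf("%d\t%s", 1:10, letters[1:10])),
           con = pathname)

# Requested rows: unsorted, with a duplicate and a non-existing row (99).
rows <- c(8, 2, 3, 2, 99, 7)

# Collapse the request into sorted unique rows and split them into
# runs of consecutive row numbers; keep the first and last row of each.
urows <- sort(unique(rows))
runId <- cumsum(c(TRUE, diff(urows) != 1))
intervals <- lapply(split(urows, runId), range)

# Read each interval with one read.table() call. The header occupies
# the first line, so skipping iv[1] lines lands on data row iv[1].
chunks <- list()
readRows <- integer(0)
for (iv in intervals) {
  chunk <- tryCatch(
    read.table(pathname, sep = "\t", skip = iv[1],
               nrows = iv[2] - iv[1] + 1,
               col.names = c("x", "y"),
               colClasses = c("integer", "character")),
    error = function(e) NULL  # treat an unreadable interval as empty
  )
  if (is.null(chunk) || nrow(chunk) == 0) next
  chunks[[length(chunks) + 1]] <- chunk
  # Remember which data rows this chunk holds.
  readRows <- c(readRows, seq(from = iv[1], length.out = nrow(chunk)))
}
tbl <- do.call(rbind, chunks)

# Restore the requested order: duplicates are kept and rows that do
# not exist in the file are dropped.
idx <- match(rows, readRows)
result <- tbl[idx[!is.na(idx)], ]
print(result)
```

     With few intervals this needs only a handful of 'read.table()'
     calls, whereas many scattered rows produce roughly one call per
     row, which is the situation where the '"readLines"' method is
     preferred.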

_V_a_l_u_e:

     Returns a 'data.frame'.

_A_u_t_h_o_r(_s):

     Henrik Bengtsson (<URL: http://www.braju.com/R/>)

_S_e_e _A_l_s_o:

     'readTableIndex'(). 'read.table'().

