Lexis                  package:Epi                  R Documentation

_S_p_l_i_t _f_o_l_l_o_w-_u_p _t_i_m_e _i_n _c_o_h_o_r_t _s_t_u_d_i_e_s.

_D_e_s_c_r_i_p_t_i_o_n:

     For cohort input data, the follow-up time is split into intervals
     along one or more timescales, and a dataframe of follow-up
     intervals is returned. Entry and exit times are assumed to be on
     the same timescale (the input timescale).

_U_s_a_g_e:

     Lexis( entry = 0,
             exit,
             fail,
           origin = 0,
            scale = 1,
           breaks,
          include = NULL,
             data = NULL )

_A_r_g_u_m_e_n_t_s:

   entry: Date of entry on the input timescale. Numerical variable.

    exit: Date of exit on the input timescale. Numerical variable.

    fail: Failure indicator (exit status), where 0 denotes censoring.

  origin: Origin of the output timescale(s) on the input timescale. If,
          for example, the input timescale is calendar time and the
          output timescale is (current) age, then the origin is the
          date of birth. If more than one timescale is used for
          splitting time, this is a list. Elements of the list must be
          named and must have the same names as those in 'scale' and
          'breaks'.

   scale: Scale of the output timescale(s) relative to the input
          timescale. If given as a list, its elements must be named
          and have the same names as those in 'origin' and 'breaks'.

  breaks: Points on the output scale where the follow-up is cut. If
          more than one timescale is used for splitting time this is a
          list. Elements of the list must be named and must have the
          same names as those in 'origin' and 'scale'.

 include: List of variables to carry unchanged from the original
          dataframe to the output dataframe.

    data: Dataframe in which to interpret the arguments.
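
     The naming rule for list-valued arguments can be checked before
     the function is called. The following is a minimal sketch in
     base R; 'check_timescales' is a hypothetical helper and not part
     of the Epi package. It only compares those arguments that were
     actually supplied as lists.

```r
# Hypothetical helper (not part of Epi): verify that every list-valued
# timescale argument is fully named and that the names agree.
check_timescales <- function(origin, breaks, scale = 1) {
  args  <- list(origin = origin, scale = scale, breaks = breaks)
  lists <- args[vapply(args, is.list, logical(1))]
  # With a single timescale nothing is a list, so there is nothing to check
  if (length(lists) == 0) return(invisible(TRUE))
  nms <- lapply(lists, names)
  unnamed <- vapply(nms, function(n) is.null(n) || any(n == ""), logical(1))
  if (any(unnamed))
    stop("unnamed list elements in: ",
         paste(names(lists)[unnamed], collapse = ", "))
  # All lists must carry the same set of timescale names
  ref  <- sort(nms[[1]])
  same <- vapply(nms, function(n) identical(sort(n), ref), logical(1))
  if (!all(same))
    stop("timescale names differ between: ",
         paste(names(lists), collapse = ", "))
  invisible(TRUE)
}

# Two timescales, 'per' and 'age', named consistently in both lists
ok_names <- check_timescales(
  origin = list( per = 0,                    age = 1952.5 ),
  breaks = list( per = seq(1900, 2000, 10),  age = seq(0, 80, 5) ) )
```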

_D_e_t_a_i_l_s:

     The 'data' is assumed to be a dataframe describing the follow-up
     of a cohort, giving entry and exit time (on the input timescale)
     for each individual as well as the exit status (failure status,
     'fail'). The purpose of the function is to split each individual's
     follow-up time along a number of timescales, for example age,
     calendar time or time since entry. Any follow-up time before the
     first break or after the last break on any of the output
     timescales is discarded.
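
     To illustrate the idea of splitting, here is a minimal base R
     sketch for one person and one output timescale. It is not the
     implementation used by 'Lexis': 'split_followup' is a
     hypothetical helper, 'scale' is taken to be 1, 'Time' is assumed
     to be the left endpoint of the break interval, and a failure
     that falls after the last break is assumed to be dropped
     together with the discarded follow-up.

```r
# Hypothetical helper (not the Epi implementation): split one person's
# follow-up at 'breaks' on a single output timescale, assuming scale = 1.
split_followup <- function(entry, exit, fail, origin, breaks) {
  breaks <- sort(breaks)
  # Follow-up expressed on the output timescale (e.g. age)
  t_in  <- entry - origin
  t_out <- exit  - origin
  # Follow-up before the first or after the last break is discarded
  lo <- max(t_in,  min(breaks))
  hi <- min(t_out, max(breaks))
  if (lo >= hi) stop("no follow-up between the first and last break")
  # Cut points: ends of the retained window plus the breaks strictly inside
  cuts  <- c(lo, breaks[breaks > lo & breaks < hi], hi)
  left  <- cuts[-length(cuts)]
  right <- cuts[-1]
  data.frame(
    Entry = left  + origin,   # back on the input timescale
    Exit  = right + origin,
    # Assumption: only an interval ending at the person's own exit
    # carries 'fail'; all other intervals are coded 0
    Fail  = ifelse(right == t_out, fail, 0),
    # Assumption: left endpoint of the break interval containing 'left'
    Time  = breaks[findInterval(left, breaks)]
  )
}

# Born 1952.5, followed from 1965.5 to 1997.5 (ages 13 to 45), failing at exit
one <- split_followup(entry = 1965.5, exit = 1997.5, fail = 1,
                      origin = 1952.5, breaks = seq(5, 50, 5))
one
```

     In this sketch the first interval starts at the person's entry
     (age 13) rather than at the preceding break (age 10), so the
     interval lengths add up to the follow-up actually observed.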

     NOTE: If a person's exit is before the first break, or the entry
     is after the last break, on any of the timescales, the function
     will crash.
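
     One way to guard against this is to filter the cohort before
     splitting. The sketch below is a suggestion only: 'has_followup'
     is a hypothetical helper, the small cohort is made up for
     illustration, 'scale' is taken to be 1, and persons whose exit
     or entry falls exactly on the boundary are treated as having no
     follow-up.

```r
# Hypothetical helper: TRUE where a person has some follow-up between
# the first and the last break on one output timescale (scale = 1).
has_followup <- function(entry, exit, origin, breaks) {
  (exit - origin) > min(breaks) & (entry - origin) < max(breaks)
}

# Made-up cohort with dates already in fractional years
coh <- data.frame( id = c("A", "B", "C"),
                   bt = c(1952.5, 1954.25, 1987.5),
                   en = c(1965.5, 1972.75, 1991.0),
                   ex = c(1997.5, 1995.5,  1991.9) )

# Age breaks 5, 10, ..., 40: person C exits at age 4.4, before the first break
ok <- has_followup( coh$en, coh$ex, origin = coh$bt, breaks = seq(5, 40, 5) )
coh[ !ok, ]           # rows that would have to be dropped
coh_ok <- coh[ ok, ]  # rows that have follow-up inside the breaks
```

     With several timescales the check would have to be repeated for
     each one and the results combined with '&'.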

_V_a_l_u_e:

     A dataframe with one row per follow-up interval, with the
     following variables: 

  Expand: A numerical vector with values in '1:nrow(data)', giving for
          each output interval the row of the input dataframe from
          which it was expanded.

   Entry: Date of entry for each interval, on the input timescale.

    Exit: Date of exit for each interval, on the input timescale.

    Fail: Exit status for each interval. Coded 0 (censoring) for all
          intervals except the last follow-up interval for each person,
          which takes the value of 'fail'.

    Time: If 'origin', 'scale' or 'breaks' were given as vectors, this
          gives the left endpoints of the intervals on the output
          scale.

          If 'origin', 'scale' or 'breaks' were given as lists, there
          is no variable 'Time' in the dataframe; instead, variables
          with the same names as the list elements are included. These
          variables hold the left endpoints of the intervals on the
          respective output timescales.

     Finally, the variables given in the argument 'include' are
     added, with values replicated across all intervals from each
     individual.
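
     The role of 'Expand' can be seen by summarising the intervals
     per person. The dataframe below is made up by hand in the layout
     described above (single timescale, so 'Time' is present); it is
     not actual output from 'Lexis'.

```r
# Made-up intervals in the documented layout: two persons, born 1952.5
# and 1954.25, split at ages 15 and 20 respectively
out <- data.frame( Expand = c(1, 1, 1, 2, 2),
                   Entry  = c(1965.5, 1967.5, 1972.5, 1972.75, 1974.25),
                   Exit   = c(1967.5, 1972.5, 1975.0, 1974.25, 1976.0),
                   Fail   = c(0, 0, 1, 0, 0),
                   Time   = c(10, 15, 20, 15, 20) )

# Retained follow-up time and number of failures per input row
py    <- tapply( out$Exit - out$Entry, out$Expand, sum )
cases <- tapply( out$Fail,             out$Expand, sum )
cbind( py, cases )
```

     Because only the last interval of a person can have a non-zero
     'Fail', the per-person sum of 'Fail' reproduces the person's
     exit status, provided that interval was not discarded.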

_A_u_t_h_o_r(_s):

     David Clayton, approx. 2000. Small modifications by Bendix
     Carstensen.

_R_e_f_e_r_e_n_c_e_s:

     This function has approximately the same functionality as
     'stsplit' in Stata and the SAS macro '%Lexis' (<URL:
     http://www.biostat.ku.dk/~bxc/Lexis/Lexis.sas>). An attempt has
     been made to keep argument names similar across the three.

_S_e_e _A_l_s_o:

     'Lexis.diagram'

_E_x_a_m_p_l_e_s:

     # A small bogus cohort
     #
     xcoh <- data.frame(    id = c("A", "B", "C"),
                         birth = c("14/07/1952", "01/04/1954", "10/06/1987"),
                         entry = c("04/08/1965", "08/09/1972", "23/12/1991"),
                          exit = c("27/06/1997", "23/05/1995", "24/07/1998"),
                          fail = c(1, 0, 1),
                         stringsAsFactors = FALSE )

     # Convert the character dates into numerical variables (fractional years)
     #
     xcoh$bt <- cal.yr( xcoh$birth, format="%d/%m/%Y" )
     xcoh$en <- cal.yr( xcoh$entry, format="%d/%m/%Y" )
     xcoh$ex <- cal.yr( xcoh$exit , format="%d/%m/%Y" )

     # See how it looks
     #
     xcoh 

     # Split time along one time-axis
     #
     Lexis( entry = en,
             exit = ex,
             fail = fail,
            scale = 1,
           origin = bt,
           breaks = seq( 5, 40, 5 ),
          include = list( bt, en, ex, id ),
             data = xcoh )

     # Split time along two time-axes
     #
     ( x2 <- 
     Lexis( entry = en,
             exit = ex,
             fail = fail,
            scale = 1,
           origin = list( per=0,                 age=bt          ),
           breaks = list( per=seq(1900,2000,10), age=seq(0,80,5) ),
          include = list( bt, en, ex, id ),
             data = xcoh ) )

     # Tabulate the cases and the person-years
     #
     tapply( x2$Fail, list( x2$age, x2$per ), sum )
     tapply( x2$Exit - x2$Entry, list( x2$age, x2$per ), sum )

