d011                 package:dblcens                 R Documentation

_C_o_m_p_u_t_e _N_P_M_L_E _o_f _C_D_F _f_r_o_m _d_o_u_b_l_y _c_e_n_s_o_r_e_d _d_a_t_a

_D_e_s_c_r_i_p_t_i_o_n:

     'd011' computes the NPMLE of the CDF from doubly censored data via
     the EM algorithm, starting from an initial estimator that has jumps
     at the uncensored points. The initial estimator also has a jump at
     the mid-point of any adjacent pair of survival times whose
     censoring indicators follow the pattern (0,2) (see the definition
     of 'd' below).
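
     The (0,2) pattern can be illustrated with a minimal base R sketch.
     The helper 'find_02_midpoints' is hypothetical and not part of
     dblcens; it assumes the pattern means a right-censored time
     (d = 0) immediately followed, in time order, by a left-censored
     time (d = 2), and it ignores ties.

```r
# Hypothetical helper, not part of dblcens: mid-points of (0,2) patterns.
# Assumes no tied times; d uses the coding 2 = left, 1 = none, 0 = right.
find_02_midpoints <- function(z, d) {
  ord <- order(z)                    # sort by observed time
  z <- z[ord]
  d <- d[ord]
  n <- length(z)
  if (n < 2) return(numeric(0))
  # TRUE where a right-censored time is followed by a left-censored one
  is02 <- d[-n] == 0 & d[-1] == 2
  (z[-n][is02] + z[-1][is02]) / 2
}

# Data from the first example on this page: the pattern occurs at times 2, 3
find_02_midpoints(z = c(1, 2, 3, 4, 5), d = c(1, 0, 2, 2, 1))
```

     For these data the sketch returns 2.5, the added time reported
     with status -1 in the Examples section.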

     When there are ties, left-censored points are treated as occurring
     before, and right-censored points as occurring after, the tied
     time, in order to break the tie. Also, the last right-censored
     observation and the first left-censored observation are changed to
     uncensored, in order to obtain a proper distribution as the
     estimator (this can be modified easily, since the code is written
     in R).
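
     The ordering rule can be sketched in base R as follows. The helper
     'order_doubly_censored' is hypothetical and is not the package's
     own code. It assumes that "last" and "first" refer to the largest
     and smallest observations after sorting, which is one reading of
     the paragraph above.

```r
# Hypothetical helper illustrating the tie-breaking rule described above.
# d coding: 2 = left-censored, 1 = uncensored, 0 = right-censored.
order_doubly_censored <- function(z, d) {
  # Within a tied time, sorting on -d puts d = 2 first and d = 0 last
  ord <- order(z, -d)
  z <- z[ord]
  d <- d[ord]
  n <- length(z)
  # Assumed reading: only the two extreme observations are redefined
  if (d[1] == 2) d[1] <- 1   # smallest time left-censored  -> uncensored
  if (d[n] == 0) d[n] <- 1   # largest time right-censored -> uncensored
  data.frame(z = z, d = d)
}

# Three observations tied at time 3, one of each censoring type
order_doubly_censored(z = c(3, 3, 3, 1, 6), d = c(0, 1, 2, 2, 0))
```

     In this toy input the sorted indicators are (2, 2, 1, 0, 0), and
     the two end points are then switched to 1.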

     It also computes the NPMLE of the two censoring distributions.
     Optionally, it can also try to compute the three influence
     functions.

_U_s_a_g_e:

      d011(z,d,identical=rep(0,length(z)),
              maxiter=49,error=0.00001,influence.fun=FALSE)

_A_r_g_u_m_e_n_t_s:

       z: a vector of length n denoting the observed times (ties
          permitted).

       d: a vector of length n that contains the censoring indicator:
          d = 2, 1 or 0, according to z being left-censored,
          uncensored, or right-censored.

identical: optional. A vector of length n with values either 0 or 1.
          identical[i]=1 means that even if (z[i],d[i]) is identical to
          (z[j],d[j]) for some j != i, they still stay as 2
          observations (not 1 observation with weight 2, which only
          happens if identical[i]=0 and identical[j]=0). One reason for
          this is that they may have different covariates not shown
          here. This gives more flexibility for regression
          applications. The default is identical = 0 for every
          observation (i.e. identical observations are collapsed).

 maxiter: optional integer value. Maximum number of EM iterations.
          Defaults to 49.

   error: optional. Convergence tolerance for the change between
          successive iterations. Defaults to 0.00001.

influence.fun: optional. Defaults to FALSE. If TRUE, the code will try
          to compute the influence functions (3 of them) at the
          censored times. This computation can be very slow and memory
          intensive (for data with >500 censored times).
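
     The effect of 'identical' can be illustrated with a base R sketch.
     The helper 'collapse_identical' and its 'weight' column are
     hypothetical; the sketch only mimics the collapsing rule described
     for the argument and is not the package's implementation.

```r
# Hypothetical helper: pool duplicated (z, d) pairs into one weighted row,
# but only among observations flagged with identical == 0.
collapse_identical <- function(z, d, identical = rep(0, length(z))) {
  keep <- identical == 1          # these always stay as separate rows
  separate <- data.frame(z = z[keep], d = d[keep],
                         weight = rep(1, sum(keep)))
  pool <- data.frame(z = z[!keep], d = d[!keep])
  if (nrow(pool) > 0) {
    pool$weight <- 1
    # Sum the unit weights within each distinct (z, d) pair
    pool <- aggregate(weight ~ z + d, data = pool, FUN = sum)
  } else {
    pool$weight <- numeric(0)
  }
  out <- rbind(pool, separate)
  out[order(out$z, -out$d), ]
}

z <- c(1, 2, 2, 3)
d <- c(1, 1, 1, 0)
collapse_identical(z, d)                            # 3 rows, one with weight 2
collapse_identical(z, d, identical = c(0, 1, 1, 0)) # 4 rows, all weight 1
```

     Flagging the two tied observations with identical = 1 keeps them
     apart, which is what a regression setting with differing
     covariates would need.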

_V_a_l_u_e:

     A list containing the NPMLE of the CDF and other information.

    time: Times of input z, with the times corresponding to status = 2
          removed.

  status: Censoring status of the above times. Status = -1 means this
          is an added time because of the censoring pattern (0,2).

    surv: Survival probability at the above times.

    jump: Jumps of the NPMLE at the above times.

 exttime: Similar to time, but with the times that have status = 2 not
          removed.

extstatus: Status of exttime.

 extjump: Jump of the NPMLE at exttime.

extsurv.Sx: Estimated lifetime distribution.

surv0.Sy: One of the censoring distributions.

   jump0: Jump of surv0.Sy

surv2.Sz: Another censoring distribution.

   jump2: Jump of surv2.Sz

    conv: A vector of length 2: the actual number of iterations
          performed, and the actual error between the last successive
          iterations. If the number of iterations equals the maxiter
          you set, then the iteration has not converged.

   Nodes: Points where the influence function is computed.

   IC1tu: Influence function value at the nodes. See Chang (1990) for
          details.

  IC1tu2: Influence function value at other points. See Chang (1990)
          for details.

   IC2tu: ditto IC1tu

   IC3tu: ditto IC1tu

   VarFt: Estimated variances of F(t) at the Nodes.
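
     The following base R sketch shows one way the returned components
     might be inspected. To stay runnable without the package, 'fit' is
     a mock list whose numbers are copied from the first example's
     output on this page; 'has_converged' is a hypothetical helper.

```r
# Mock of selected components, copied from the Example 1 output
fit <- list(
  time   = c(1, 2, 2.5, 5),
  status = c(1, 0, -1, 1),
  surv   = c(0.5000351, 0.5000351, 0.3333177, 0),
  jump   = c(0.4999649, 0, 0.1667174, 0.3333177),
  conv   = c(33, 8.788214e-06)
)
maxiter <- 49   # the default given under Usage

# Hypothetical helper: TRUE when the EM stopped before hitting maxiter
has_converged <- function(conv, maxiter) conv[1] < maxiter

has_converged(fit$conv, maxiter)

# In this output, surv agrees with 1 - cumsum(jump) up to rounding
all.equal(fit$surv, 1 - cumsum(fit$jump), tolerance = 1e-6)

# Times that were added because of a (0,2) pattern carry status -1
fit$time[fit$status == -1]
```

     The relation between 'surv' and 'jump' is an observation about
     this particular output, not a documented guarantee.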

_A_u_t_h_o_r(_s):

     Mai Zhou mai@ms.uky.edu, Li Lee.

_R_e_f_e_r_e_n_c_e_s:

     Chang, M. N. and Yang, G. L. (1987). Strong consistency of a
     nonparametric estimator of the survival function with doubly
     censored data. Ann. Statist. 15, 1536-1547.

     Turnbull, B. W. (1976). The empirical distribution function with
     arbitrarily grouped, censored and truncated data. J. Roy. Statist.
     Soc. Ser. B 38, 290-295.

     Chang, M. N. (1990). Weak convergence in doubly censored data.
     Ann. Statist. 18, 390-405.

     Chen, K. and Zhou, M. (2000).  Nonparametric Hypothesis Testing
     and Confidence Intervals with Doubly Censored Data. Tech Report,
     Univ. of Kentucky.

_E_x_a_m_p_l_e_s:

     d011(z=c(1,2,3,4,5), d=c(1,0,2,2,1))
     #
     # you should get something like (and more)
     #
     #       $time:
     #       [1] 1.0 2.0 2.5 5.0    (notice that the times (3,4), corresponding
     #                                   to d=2, are removed, and time 2.5 is added
     #       $status:               since there is a (0,2) pattern at
     #       [1]  1  0 -1  1        times 2, 3. The status indicator of -1
     #                                   shows that it is an added time)
     #       $surv
     #       [1] 0.5000351 0.5000351 0.3333177 0.0000000
     #
     #       $jump
     #       [1] 0.4999649 0.0000000 0.1667174 0.3333177
     #
     #       $exttime
     #       [1] 1.0 2.0 2.5 3.0 4.0 5.0
     #
     #       $extstatus
     #       [1]  1  0 -1  2  2  1
     #
     #       ...... 
     #
     #       $conv
     #       [1] 3.300000e+01  8.788214e-06  ### did 33 iterations
     #
     # BTW, the true NPMLE of surv is (1/2, 1/2, 1/3, 0) at times (1,2,2.5,5).
     ###### Example 2. 
     d011(c(1,2,3,4,5), c(1,2,1,0,1),influence.fun=TRUE)
     #     we get
     # ......
     #$conv:
     #[1] 3 0
     #
     #$Nodes:
     #[1] 2 4
     #
     #$IC1tu:
     #     [,1] [,2]
     #[1,]   -1    0
     #[2,]   -1   -2
     #
     #$IC2tu:
     #           [,1] [,2]
     #[1,]  0.0000000    0
     #[2,] -0.3333333    0
     #
     #$IC3tu:
     #     [,1]       [,2]
     #[1,]   -1 -0.6666667
     #[2,]   -1 -1.0000000
     #
     #$VarFt:
     #[1] 0.24 0.24           ## est var of F(t) at t=nodes
     #######################################################
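
     A further sketch, using only the arguments listed under Usage,
     reruns the first example with a larger 'maxiter' and a smaller
     'error' and then checks 'conv'. The wrapper 'run_d011_example' is
     hypothetical, and the values 200 and 1e-8 are arbitrary
     illustrative choices. The call is skipped when dblcens is not
     installed.

```r
# Sketch only: calls d011 when the dblcens package is available.
run_d011_example <- function(maxiter = 200, error = 1e-8) {
  if (!requireNamespace("dblcens", quietly = TRUE)) {
    message("dblcens is not installed; skipping the example")
    return(invisible(NULL))
  }
  fit <- dblcens::d011(z = c(1, 2, 3, 4, 5), d = c(1, 0, 2, 2, 1),
                       maxiter = maxiter, error = error)
  # conv[1] is the number of iterations used, conv[2] the final error
  if (fit$conv[1] >= maxiter) {
    warning("EM reached maxiter; the iteration has not converged")
  }
  fit
}

fit <- run_d011_example()
if (!is.null(fit)) print(fit$conv)
```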

