el.cen.EM               package:emplik               R Documentation

_E_m_p_i_r_i_c_a_l _l_i_k_e_l_i_h_o_o_d _r_a_t_i_o _f_o_r _m_e_a_n 
_w_i_t_h _r_i_g_h_t, _l_e_f_t _o_r _d_o_u_b_l_y _c_e_n_s_o_r_e_d _d_a_t_a, _b_y _E_M _a_l_g_o_r_i_t_h_m

_D_e_s_c_r_i_p_t_i_o_n:

     This function uses the EM algorithm to compute the maximized (with
     respect to p_i) empirical log likelihood for right, left or doubly
     censored data under the mean constraint:

          sum_{d_i=1} p_i f(x_i) = int f(t) dF(t) = mu .

     Here p_i = Delta F(x_i) is a probability and d_i is the censoring
     indicator: 1 (uncensored), 0 (right censored), 2 (left censored).
     The function also returns those p_i.

     The empirical log likelihood being maximized is

 sum_{d_i=1} log Delta F(x_i) + sum_{d_i=0} log [1-F(x_i)]  + sum_{d_i=2}  log F(x_i) .
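
     As a rough illustration of the formula above, the following sketch
     evaluates this log likelihood for a given set of jumps. The helper
     'cen_loglik' and its arguments 'tm' and 'p' are hypothetical names,
     not part of emplik; the sketch assumes the jumps 'p' sit at the
     distinct uncensored times 'tm' and sum to one.

```r
# Hypothetical helper, not part of emplik: censored empirical log likelihood
# for jumps p located at the distinct uncensored times tm.
cen_loglik <- function(x, d, tm, p) {
  Fhat <- function(t) vapply(t, function(s) sum(p[tm <= s]), numeric(1))  # F(t)
  jump <- function(t) vapply(t, function(s) sum(p[tm == s]), numeric(1))  # Delta F(t)
  sum(log(jump(x[d == 1]))) +        # uncensored observations
    sum(log(1 - Fhat(x[d == 0]))) +  # right censored observations
    sum(log(Fhat(x[d == 2])))        # left censored observations
}

# Toy data: three observations, the middle one right censored.
x <- c(1, 2, 3)
d <- c(1, 0, 1)
ll <- cen_loglik(x, d, tm = c(1, 3), p = c(1/3, 2/3))
```

     Here the three terms are log(1/3), log(2/3) and log(1 - 1/3), so
     'll' equals log(4/27).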

_U_s_a_g_e:

     el.cen.EM(x,d,fun=function(t){t},mu,maxit=25,error=1e-9,...)

_A_r_g_u_m_e_n_t_s:

       x: a vector containing the observed survival times.

       d: a vector containing the censoring indicators: 1 = uncensored,
          0 = right censored, 2 = left censored.

     fun: a continuous (weight) function used to calculate the mean as
          in H_0. 'fun(t)' must accept a vector input 't'. Defaults to
          the identity function f(t)=t.

      mu: a real number used in the constraint: the hypothesized mean
          value of f(X).

   maxit: an optional integer giving the maximum number of
          iterations.

   error: an optional positive real number specifying the tolerance of
          the iteration error. This is the bound on the L_1 norm of the
          difference between two successive weight vectors.

     ...: additional arguments, if any, to pass to fun.
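
     To make the roles of 'maxit' and 'error' concrete, here is a
     minimal sketch of a stopping rule of this kind. It is not the
     internal code of el.cen.EM: the helper 'iterate_weights' and the
     toy 'update' map are hypothetical, and only the L_1 criterion and
     the two defaults are taken from the descriptions above.

```r
# Hypothetical helper, not emplik's code: repeat an update step until the
# L_1 norm of the change in the weights falls below 'error', or 'maxit'
# iterations have been used (assumes maxit >= 1).
iterate_weights <- function(w0, update, maxit = 25, error = 1e-9) {
  w <- w0
  for (k in seq_len(maxit)) {
    w.new <- update(w)
    delta <- sum(abs(w.new - w))   # L_1 norm of the difference
    w <- w.new
    if (delta < error) break
  }
  list(w = w, iterations = k, converged = delta < error)
}

# Toy update that moves the weights halfway towards c(0.2, 0.8).
toy_update <- function(w) (w + c(0.2, 0.8)) / 2
short <- iterate_weights(c(0.5, 0.5), toy_update)               # default maxit
long  <- iterate_weights(c(0.5, 0.5), toy_update, maxit = 100)
```

     With this slowly converging toy map the default 'maxit' is used up
     before the tolerance is met ('short$converged' is FALSE), while
     the run with a larger 'maxit' stops on the 'error' criterion.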

_D_e_t_a_i_l_s:

     This implementation is written entirely in R and contains several
     for-loops. A faster version would do the for-loop part in C, but
     this version seems fast enough and is easier to port to Splus.

     The log likelihood is always returned. In some cases (right
     censored data, or data with no censoring) the -2 log likelihood
     ratio is also returned. In other cases, you have to plot a curve
     over many values of the parameter mu to find where the log
     likelihood attains its maximum. From there you can get the -2 log
     likelihood ratio between the maximum location and the current
     parameter value in H_0.
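
     The grid search described above could be sketched as follows. The
     helper 'profile_mu' is a hypothetical name, not part of emplik; it
     only assumes that the function passed as 'el.fun' accepts 'x', 'd'
     and 'mu' and returns a list with a 'loglik' component, as
     el.cen.EM is documented to do. The maximum it finds is only as
     accurate as the grid is fine.

```r
# Hypothetical helper: profile the constrained log likelihood over a grid of mu.
# 'el.fun' is any function returning a list with a 'loglik' component,
# e.g. emplik::el.cen.EM; mu values where el.fun stops are recorded as NA.
profile_mu <- function(el.fun, x, d, mu.grid, ...) {
  ll <- vapply(mu.grid, function(m) {
    out <- tryCatch(el.fun(x, d, mu = m, ...), error = function(e) NULL)
    if (is.null(out)) NA_real_ else out$loglik
  }, numeric(1))
  best <- which.max(ll)                  # which.max() skips NA entries
  list(mu = mu.grid, loglik = ll,
       mu.hat = mu.grid[best],           # grid point with the largest loglik
       llr = 2 * (ll[best] - ll))        # -2 log likelihood ratio vs. the maximum
}

# Usage sketch, run only if emplik is installed.
if (requireNamespace("emplik", quietly = TRUE)) {
  x <- c(1, 1.5, 2, 3, 4, 5, 6, 5, 4, 1, 2, 4.5)
  d <- c(1,   1, 0, 1, 0, 1, 1, 1, 1, 0, 0,   1)
  prof <- profile_mu(emplik::el.cen.EM, x, d, mu.grid = seq(3, 5, by = 0.25))
  # plot(prof$mu, prof$loglik, type = "b") would draw the curve
}
```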

     In order to get a proper distribution as the NPMLE, we
     automatically change the d for the largest observation to 1 (even
     if it is right censored), and similarly for the smallest
     observation if it is left censored. mu is a given constant. When
     the given constant mu is too far away from the NPMLE, no
     distribution will satisfy the constraint. In this case the
     computation will stop, and the -2 log empirical likelihood ratio
     should be regarded as infinite.
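
     The adjustment of 'd' at the extreme observations can be pictured
     with the short sketch below. 'adjust_extremes' is a hypothetical
     helper written for illustration only; in particular, applying the
     change to every observation tied at the maximum or minimum is an
     assumption, not something stated here about el.cen.EM.

```r
# Hypothetical helper, not emplik's code: treat a right censored largest
# observation and a left censored smallest observation as uncensored.
adjust_extremes <- function(x, d) {
  d[x == max(x) & d == 0] <- 1   # largest observation: 0 -> 1
  d[x == min(x) & d == 2] <- 1   # smallest observation: 2 -> 1
  d
}

d.new <- adjust_extremes(x = c(1, 2, 3, 4), d = c(2, 1, 0, 0))
```

     In this toy call only the first and last indicators change; the
     right censored observation at 3 is left as it is.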

     The constant 'mu' must be inside ( min f(x_i) , max f(x_i) ) for
     the computation to continue. The NPMLE values are always
     feasible, so when the computation stops, try moving 'mu' closer
     to the NPMLE value

                       sum_{d_i=1} p_i^0 f(x_i)

     where the p_i^0 are the jumps of the NPMLE of the CDF, or use a
     different 'fun'.
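
     For right censored data the NPMLE is the Kaplan-Meier estimator,
     so a feasible starting value for 'mu' can be worked out by hand.
     The sketch below is an illustration under that assumption; the
     helpers 'km_jumps' and 'mu_in_range' are hypothetical names, not
     part of emplik, and 'km_jumps' handles only d in {0, 1}.

```r
# Hypothetical helper: Kaplan-Meier jumps for right censored data, with the
# largest observation treated as uncensored so that the jumps sum to one.
km_jumps <- function(x, d) {
  d[x == max(x)] <- 1
  tm <- sort(unique(x[d == 1]))          # distinct uncensored times
  p <- numeric(length(tm))
  surv <- 1                              # survival just before tm[j]
  for (j in seq_along(tm)) {
    at.risk <- sum(x >= tm[j])
    deaths  <- sum(x == tm[j] & d == 1)
    p[j] <- surv * deaths / at.risk      # jump of the CDF at tm[j]
    surv <- surv - p[j]
  }
  list(times = tm, prob = p)
}

# Hypothetical helper: is mu strictly inside ( min f(x_i), max f(x_i) )?
mu_in_range <- function(x, mu, fun = function(t) t, ...) {
  fx <- fun(x, ...)
  mu > min(fx) && mu < max(fx)
}

x <- c(1, 1.5, 2, 3, 4, 5, 6, 5, 4, 1, 2, 4.5)
d <- c(1,   1, 0, 1, 0, 1, 1, 1, 1, 0, 0,   1)
km <- km_jumps(x, d)
mu.npmle <- sum(km$prob * km$times)      # sum_{d_i=1} p_i^0 f(x_i) with f(t) = t
ok <- mu_in_range(x, mu = 3.5)
```

     Values of 'mu' near 'mu.npmle' are the safest choices; passing the
     range check is necessary for the computation to continue, but the
     check alone is not claimed here to guarantee it.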

_V_a_l_u_e:

     A list with the following components: 

  loglik: the maximized empirical log likelihood under the constraint.

   times: locations of the CDF that have positive mass.

    prob: the jump sizes of the CDF at those locations.

 "-2LLR": if available, minus two times the empirical log likelihood
          ratio. It should be approximately chi-square distributed
          under H_0.

    Pval: the P-value of the test, using the chi-square approximation.
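
     The chi-square approximation can be reproduced by hand as in the
     sketch below. One degree of freedom is assumed here because there
     is a single mean constraint; the value 1.2466 is the "-2LLR"
     quoted in the Examples for mu = 3.5.

```r
# P-value from a -2 log likelihood ratio, assuming a chi-square(1) limit.
llr  <- 1.2466                                  # "-2LLR" quoted in the Examples
pval <- pchisq(llr, df = 1, lower.tail = FALSE)

# Cutoff for a 95% test: reject H_0 when the -2LLR exceeds this value.
cutoff <- qchisq(0.95, df = 1)
reject <- llr > cutoff
```

     The same cutoff can be used to read a confidence interval for the
     mean off a curve of -2LLR values computed over a range of 'mu'.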

_A_u_t_h_o_r(_s):

     Mai Zhou

_R_e_f_e_r_e_n_c_e_s:

     Zhou, M. (2002).  Computing censored empirical likelihood ratio 
     by EM algorithm.  _Tech Report, Univ. of Kentucky, Dept of
     Statistics_

     Murphy, S. and van der Vaart, A. (1997).  Semiparametric likelihood
     ratio inference. _Ann. Statist._ *25*, 1471-1509.

_E_x_a_m_p_l_e_s:

     ## example with tied observations
     x <- c(1, 1.5, 2, 3, 4, 5, 6, 5, 4, 1, 2, 4.5)
     d <- c(1,   1, 0, 1, 0, 1, 1, 1, 1, 0, 0,   1)
     el.cen.EM(x,d,mu=3.5)
     ## we should get "-2LLR" = 1.2466....
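     ## myfun5(x, theta, eps) is a smoothed version of the indicator
     ## I[x <= theta]; the smoothing window is (theta-eps, theta+eps)
     myfun5 <- function(x, theta, eps) {
       u <- (x-theta)*sqrt(5)/eps
       INDE <- (u < sqrt(5)) & (u > -sqrt(5))   # points inside the window
       u[u >= sqrt(5)] <- 0                     # x at or above theta+eps
       u[u <= -sqrt(5)] <- 1                    # x at or below theta-eps
       y <- 0.5 - (u - (u)^3/15)*3/(4*sqrt(5))  # smooth drop from 1 to 0
       u[ INDE ] <- y[ INDE ]
       return(u)
     }
     el.cen.EM(x, d, fun=myfun5, mu=0.5, theta=3.5, eps=0.1)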

