mcgibbsit             package:mcgibbsit             R Documentation

_W_a_r_n_e_s _a_n_d _R_a_f_t_e_r_y'_s _M_C_G_i_b_b_s_i_t _M_C_M_C _d_i_a_g_n_o_s_t_i_c

_D_e_s_c_r_i_p_t_i_o_n:

     mcgibbsit provides an implementation of Warnes and Raftery's
     MCGibbsit run-length diagnostic for a set of (not necessarily
     independent) MCMC samplers.  It combines the estimation
     error-bounding approach of Raftery and Lewis with the
     between-chain versus within-chain variance approach of Gelman
     and Rubin.

_U_s_a_g_e:

     mcgibbsit(data, q=0.025, r=0.0125, s=0.95, converge.eps=0.001,
               correct.cor=TRUE)

_A_r_g_u_m_e_n_t_s:

    data: an `mcmc' object.

       q: quantile(s) to be estimated.

       r: the desired margin of error of the estimate.

       s: the probability of obtaining an estimate in the interval
          (q-r, q+r).

converge.eps: precision required for the estimate of time to
          convergence.

correct.cor: should the between-chain correlation correction (R) be
          computed and applied?  Set to FALSE for independent MCMC
          chains.

_D_e_t_a_i_l_s:

     'mcgibbsit' computes the minimum run length Nmin, the required
     burn in M, the total run length N, the run length inflation due
     to auto-correlation, I, and the run length inflation due to
     between-chain correlation, R, for a set of exchangeable MCMC
     simulations which need not be independent.

     The normal usage is to perform an initial MCMC run of some
     pre-determined length (e.g., 300 iterations) for each of a set of
     k (e.g., k=20) MCMC samplers.  The output from these samplers is
     then read in to create an 'mcmc.list' object, and 'mcgibbsit' is
     run specifying the desired accuracy of estimation for the
     quantiles of interest.  This returns the minimum number of
     iterations needed to achieve the specified error bound.  The set
     of MCMC samplers is then run so that the total number of
     iterations exceeds this minimum, and 'mcgibbsit' is called again.
     This should continue until the number of iterations already
     completed exceeds the minimum number computed by 'mcgibbsit'.
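     The iterative procedure above can be sketched as follows.  This
     is a schematic only: 'run.sampler' is a hypothetical user-supplied
     function that advances each of the k samplers by 'n.iter'
     iterations and writes one output file per chain, and the returned
     object is assumed to behave as a list with the components
     described under Value below.

```r
library(mcgibbsit)

k      <- 20    # number of parallel MCMC samplers
n.iter <- 300   # initial (pilot) run length per chain

repeat {
  run.sampler(n.iter)                           # hypothetical user function

  # read one file per chain; '#' is replaced by the chain number
  data <- read.mcmc(k, "mcmc.#.csv", sep = ",")
  res  <- mcgibbsit(data)

  # total iterations required (burn in + sampling), worst case
  # over all monitored parameters
  needed <- max(res$resmatrix[, "Total"])

  if (k * n.iter >= needed) break               # enough iterations: stop

  n.iter <- ceiling(needed / k)                 # extend the runs and retry
}
```

     The loop terminates once the iterations already completed exceed
     the minimum that 'mcgibbsit' reports.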

     If the initial number of iterations in 'data' is too small to
     perform the calculations, an error message is printed indicating
     the minimum pilot run length.

     The parameters 'q', 'r', 's', 'converge.eps', and 'correct.cor'
     can be supplied as vectors.  This will cause 'mcgibbsit' to
     produce a list of results, with one element produced for each set
     of values.  I.e., setting 'q=c(0.025,0.975), r=c(0.0125,0.005)'
     will yield a list containing two 'mcgibbsit' objects, one
     computed with parameters 'q=0.025, r=0.0125', and the other with
     'q=0.975, r=0.005'.
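     For example, to monitor both tails of the distribution in a
     single call (a sketch; per the paragraph above, the result is a
     list with one 'mcgibbsit' object per (q, r) pair):

```r
library(mcgibbsit)

data <- read.mcmc(20, "mcmc.#.csv", sep = ",")

# one diagnostic per (q, r) pair: (0.025, 0.0125) and (0.975, 0.005)
res <- mcgibbsit(data, q = c(0.025, 0.975), r = c(0.0125, 0.005))

res[[1]]  # computed with q=0.025, r=0.0125
res[[2]]  # computed with q=0.975, r=0.005
```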

_V_a_l_u_e:

     An 'mcgibbsit' object with components

    call: parameters used to call 'mcgibbsit'

  params: values of r, s, and q used

resmatrix: a matrix with 6 columns:

          _N_m_i_n The minimum required sample size for a chain with no 
               correlation between consecutive samples. Positive 
               autocorrelation will increase the required sample size
               above this minimum value.

          _M The number of `burn in' iterations to be discarded  (total
               over all chains).

          _N The number of iterations after burn in required to estimate
               the quantile q to within an accuracy of +/- r with
               probability p (total over all chains).

          _T_o_t_a_l Overall number of iterations required (M + N).

          _I An estimate (the `dependence factor') of the extent to
               which auto-correlation inflates the required sample
               size.  Values of `I' larger than 5 indicate strong
               autocorrelation which may be due to a poor choice of
               starting value, high posterior correlations, or
               `stickiness' of the MCMC algorithm.

          _R An estimate of the extent to which between-chain
               correlation inflates the required sample size.  Large
               values of 'R' indicate that there is significant
               correlation between the chains and may be indicative of
               a lack of convergence or a poor multi-chain algorithm.

 nchains: the number of MCMC chains in the data

     len: the length of each chain
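     A sketch of inspecting these components (assuming, per the
     component list above, that the returned object behaves as a list
     with a 'resmatrix' element whose rows correspond to the monitored
     parameters):

```r
res <- mcgibbsit(data)

res$resmatrix[, "Total"]       # required run length, per parameter
any(res$resmatrix[, "I"] > 5)  # strong autocorrelation anywhere?
res$nchains                    # number of chains used
res$len                        # length of each chain
```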

_A_u_t_h_o_r(_s):

     Gregory R. Warnes gregory_r_warnes@groton.pfizer.com, based on
     the R function 'raftery.diag', which is part of the 'coda'
     package.  'raftery.diag', in turn, is based on the FORTRAN
     program `gibbsit' written by Steven Lewis, which is available
     from the Statlib archive.

_R_e_f_e_r_e_n_c_e_s:

     Warnes, G.W. (2004). The Normal Kernel Coupler: An adaptive MCMC
     method for efficiently sampling from multi-modal distributions
     (web site), <URL:
     http://www.analytics.washington.edu/Zope/projects/MCMC/NKC/index.html>.

     Warnes, G.W. (2000).  Multi-Chain and Parallel Algorithms for
     Markov Chain Monte Carlo.  Dissertation, Department of
     Biostatistics, University of Washington.
     Raftery, A.E. and Lewis, S.M. (1992).  One long run with
     diagnostics: Implementation strategies for Markov chain Monte
     Carlo. Statistical Science, 7, 493-497.

     Raftery, A.E. and Lewis, S.M. (1995).  The number of iterations,
     convergence diagnostics and generic Metropolis algorithms.  In
     Practical Markov Chain Monte Carlo (W.R. Gilks, D.J. Spiegelhalter
     and S. Richardson, eds.). London, U.K.: Chapman and Hall.

_S_e_e _A_l_s_o:

     'read.mcmc'

_E_x_a_m_p_l_e_s:

     # This example is statistically trivial, but it exercises the
     # code: simulate output from 20 MCMC samplers, each stored in
     # its own CSV file, one column per monitored parameter.
     for(i in 1:20){
       x <- matrix(rnorm(1000), ncol=4)
       # make the fourth parameter depend on the first three
       x[,4] <- x[,4] + 1/3 * (x[,1] + x[,2] + x[,3])
       colnames(x) <- c("alpha", "beta", "gamma", "nu")
       write.table(x, file=paste("mcmc", i, "csv", sep="."), sep=",")
     }

     # read the 20 chains back in; '#' in the file name pattern is
     # replaced by the chain number
     data <- read.mcmc(20, "mcmc.#.csv", sep=",")

     mcgibbsit(data)

