\name{ecme}
\alias{ecme}
\title{
ECME algorithm for general linear mixed model, as
described by Schafer (1997)
}
\description{
Performs maximum-likelihood estimation for general linear
mixed models. The model, which is typically applied to
longitudinal or clustered responses, is

yi = Xi\%*\%beta + Zi\%*\%bi + ei,    i=1,\dots,m,

where

yi    = (ni x 1) response vector for subject
        or cluster i;

Xi    = (ni x p) matrix of covariates;

Zi    = (ni x q) matrix of covariates;

beta  = (p x 1) vector of coefficients common to the
        population (fixed effects);

bi    = (q x 1) vector of coefficients specific to
        subject or cluster i (random effects); and

ei    = (ni x 1) vector of residual errors.

The vector bi is assumed to be normally distributed
with mean zero and unstructured covariance matrix psi, 

bi \eqn{\sim}{~}  N(0,psi) independently for i=1,\dots,m.

The residual vector ei is assumed to be distributed as

ei \eqn{\sim}{~} N(0,sigma2*Vi)

where Vi is a known (ni x ni) matrix. In most applications,
Vi is the identity matrix.
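
As a rough illustration of the model above, the following sketch
simulates data with a random intercept (q = 1), a fixed intercept and
slope (p = 2), and Vi equal to the identity. All numerical values and
the variable names m, ni, and time are assumptions made for this
sketch; they are not taken from ecme() or its example file.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
m    <- 5                          # number of subjects (assumed)
ni   <- 4                          # observations per subject (assumed equal)
subj <- rep(seq_len(m), each = ni)
time <- rep(seq_len(ni), times = m)

beta   <- c(10, 2)                 # fixed intercept and slope (assumed)
psi    <- 4                        # variance of the random intercept
sigma2 <- 1                        # residual variance, with Vi = identity

b <- rnorm(m, mean = 0, sd = sqrt(psi))           # bi ~ N(0, psi)
e <- rnorm(m * ni, mean = 0, sd = sqrt(sigma2))   # ei ~ N(0, sigma2 * I)

# yi = Xi beta + Zi bi + ei, with Xi = (1, time) and Zi = 1
y <- beta[1] + beta[2] * time + b[subj] + e
```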
}

\usage{
ecme(y, subj, occ, pred, xcol, zcol=NULL, vmax, start, 
     maxits=1000, eps=0.0001, random.effects=F)

}
\arguments{
\item{y}{
vector of responses. This is simply the individual yi vectors
stacked upon one another. Each element of y represents the 
observed response for a particular subject-occasion, or for a
particular unit within a cluster.
}

\item{subj}{
vector of same length as y, giving the subject (or cluster)
indicators i for the elements of y. For example, suppose 
that y is in fact c(y1,y2,y3,y4) where length(y1)=2,
length(y2)=3, length(y3)=2, and length(y4)=7. Then subj should
be c(1,1,2,2,2,3,3,4,4,4,4,4,4,4).
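
The indicator vector in this example can be generated from the
subject sizes with rep(). The name n_i is introduced here only for
illustration.

```r
# lengths of y1, y2, y3, y4 as in the example above
n_i  <- c(2, 3, 2, 7)
subj <- rep(seq_along(n_i), times = n_i)
subj
```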
}

\item{occ}{
vector of same length as y indicating the "occasions" for the
elements of y. In a longitudinal dataset where each individual
is measured on at most nmax distinct occasions, each element
of y corresponds to one subject-occasion, and the elements
of occ should be coded as 1,2,\dots,nmax to indicate these
occasion labels. (You should label the occasions as
1,2,\dots,nmax even if they are not equally spaced in time; the
actual times of measurement will be incorporated into the
matrix "pred" below.) In a clustered dataset, the elements 
of occ label the units within each cluster i, using the labels
1,2,\dots,ni.
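
For clustered data, one way to build such labels is to number the
units within each cluster in order of appearance. This is only a
sketch: it assumes the rows are already grouped by cluster, and it is
not appropriate for longitudinal data in which some subjects skip
occasions, where the labels must come from the actual occasion
numbers.

```r
subj <- c(1, 1, 2, 2, 2, 3, 3)     # hypothetical cluster indicators
# number the units 1, 2, ..., ni within each cluster
occ <- ave(seq_along(subj), subj, FUN = seq_along)
occ
```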
}

\item{pred}{
matrix of covariates used to predict y. The number of rows
should be length(y). The first column will typically be
constant (one), and the remaining columns correspond to other
variables appearing in Xi and Zi.
}

\item{xcol}{
vector of integers indicating which columns of pred will be
used in Xi. That is, pred[,xcol] is the Xi matrices (stacked
upon one another).
}

\item{zcol}{
vector of integers indicating which columns of pred will be
used in Zi. That is, pred[,zcol] is the Zi matrices (stacked
upon one another). If zcol=NULL then the model is assumed
to have no random effects; in that case the parameters are
estimated noniteratively by generalized least squares.
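
The relationship between pred, xcol, and zcol can be sketched as
follows for a model with a fixed intercept and slope and a random
intercept. The measurement times and the column names are
hypothetical.

```r
time <- c(0, 1, 3, 0, 2, 3)           # hypothetical measurement times
pred <- cbind(int = 1, time = time)   # column 1 is the constant
xcol <- 1:2                           # Xi: intercept and time
zcol <- 1                             # Zi: random intercept only

X <- pred[, xcol, drop = FALSE]       # the stacked Xi matrices
Z <- pred[, zcol, drop = FALSE]       # the stacked Zi matrices
```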
}

\item{vmax}{
optional matrix of dimension c(max(occ),max(occ)) from which
the Vi matrices will be extracted. In a longitudinal dataset, 
vmax would represent the Vi matrix for an individual with
responses at all possible occasions 1,2,\dots,nmax=max(occ);
for individuals with responses at only a subset of these
occasions, the Vi will be obtained by extracting the rows
and columns of vmax for those occasions. If no vmax is
specified by the user, an identity matrix is used. In most
applications of this model one will want to have Vi =
identity, so most of the time this argument can be omitted.
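
The extraction described above can be mimicked directly in R. The
sketch below is not ecme()'s internal code; the AR(1)-type pattern,
the value of rho, and the name occ_i are assumptions chosen only to
make the submatrix visible.

```r
nmax <- 4
rho  <- 0.5                                   # assumed correlation parameter
vmax <- rho^abs(outer(1:nmax, 1:nmax, "-"))   # illustrative (nmax x nmax) matrix

occ_i <- c(1, 3, 4)                           # subject observed at occasions 1, 3, 4
Vi <- vmax[occ_i, occ_i, drop = FALSE]        # rows and columns for those occasions
Vi
```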
}

\item{start}{
optional starting values of the parameters. If this argument
is not given then ecme() chooses its own starting values.
This argument should be a list of three elements named
"beta", "psi", and "sigma2". Note that "beta" should be a
vector of the same length as "xcol", "psi" should be a
matrix of dimension c(length(zcol),length(zcol)), and
"sigma2" should be a scalar. This argument has no effect if
zcol=NULL.
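
A list of the required shape might be built as follows. The numerical
values are placeholders rather than recommended starting values.

```r
xcol <- 1:2
zcol <- 1
start <- list(beta   = rep(0, length(xcol)),   # vector of length(xcol)
              psi    = diag(length(zcol)),     # length(zcol) x length(zcol) matrix
              sigma2 = 1)                      # scalar
```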
}

\item{maxits}{
maximum number of cycles of ECME to be performed.
The algorithm runs to convergence or until "maxits"
iterations, whichever comes first.
}

\item{eps}{
convergence criterion. The algorithm is considered to have
converged if the relative differences in all parameters from
one iteration to the next are less than eps; that is, if
all(abs(new-old)<eps*abs(old)).
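
The criterion can be written as a small helper for experimentation.
The function name has_converged is hypothetical and is not part of
this package. Note that, because the inequality is strict, a parameter
whose old value is exactly zero can never satisfy it.

```r
has_converged <- function(new, old, eps = 0.0001) {
  # relative change in every parameter must be below eps
  all(abs(new - old) < eps * abs(old))
}

has_converged(c(1.00001, 2.00001), c(1, 2))   # small relative changes
has_converged(c(1.1, 2), c(1, 2))             # first parameter moved by 10 percent
```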
}

\item{random.effects}{
if TRUE, returns empirical Bayes estimates of all the random
effects bi (i=1,2,\dots,m) and their estimated covariance
matrices.
}}

\value{
a list containing estimates of beta, sigma2, psi, an estimated
covariance matrix for beta, the number of iterations actually
performed, an indicator of whether the algorithm converged,
and a vector of loglikelihood values at each iteration. If
random.effects=T, also returns a matrix of estimated random
effects (bhat) for individuals and an array of corresponding
covariance matrices. 

\item{beta}{
vector of same length as "xcol" containing estimated fixed
effects.
}

\item{sigma2}{
estimate of error variance sigma2.
}

\item{psi}{
matrix of dimension c(length(zcol),length(zcol)) containing
the estimated covariance matrix psi.
}

\item{converged}{
T if the algorithm converged, F if it did not.
}

\item{iter}{
number of iterations actually performed. Will be equal
to "maxits" if converged=F.
}

\item{loglik}{
vector of length "iter" reporting the value of the
loglikelihood at each iteration.
}

\item{cov.beta}{
matrix of dimension c(length(xcol),length(xcol)) containing
estimated variances and covariances for elements of "beta".
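
Standard errors for the fixed effects are the square roots of the
diagonal of this matrix. In the sketch below, result is a hypothetical
stand-in for the list returned by ecme(), with made-up numbers.

```r
# hypothetical stand-in for the list returned by ecme()
result <- list(beta     = c(10.2, 1.9),
               cov.beta = matrix(c(0.25, 0.01, 0.01, 0.04), 2, 2))

se <- sqrt(diag(result$cov.beta))
cbind(estimate = result$beta, std.error = se)
```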
}

\item{bhat}{
if random.effects=T, a matrix with length(zcol) rows and
m columns, where bhat[,i] is an empirical Bayes estimate
of bi.
}

\item{cov.b}{
if random.effects=T, an array of dimension length(zcol) by
length(zcol) by m, where cov.b[,,i] is an empirical Bayes
estimate of the covariance matrix associated with bi.
}}

\references{
Schafer, J.L. (1997) Imputation of missing covariates under
a multivariate linear mixed model. Technical report, Dept. of
Statistics, The Pennsylvania State University.
}

\examples{
\dontrun{
# For a detailed example, see the file "ecmeex.R" distributed
# with this function.
}}

\keyword{models}

\eof
\name{pan}
\alias{pan}
\title{
Imputation of multivariate panel or cluster data
}
\description{
Gibbs sampler for the multivariate linear mixed model with
incomplete data described by Schafer (1997). This function
will typically be used to produce multiple imputations of
missing data values in multivariate panel data or clustered
data. The underlying model is

yi = Xi\%*\%beta + Zi\%*\%bi + ei,    i=1,\dots,m,

where

yi    = (ni x r) matrix of incomplete multivariate
        data for subject or cluster i;

Xi    = (ni x p) matrix of covariates;

Zi    = (ni x q) matrix of covariates;

beta  = (p x r) matrix of coefficients common to the
        population (fixed effects);

bi    = (q x r) matrix of coefficients specific to
        subject or cluster i (random effects); and

ei    = (ni x r) matrix of residual errors.

The matrix bi, when stacked into a single column, is assumed
to be normally distributed with mean zero and unstructured
covariance matrix psi, and the rows of ei are assumed to be
independently normal with mean zero and unstructured
covariance matrix sigma. Missing values may appear in yi in
any pattern.

In most applications of this model, the first columns of Xi
and Zi will be constant (one) and Zi will contain a subset of
the columns of Xi. 
}

\usage{
pan(y, subj, pred, xcol, zcol, prior, seed, iter=1, start)
}

\arguments{
\item{y}{
matrix of responses. This is simply the individual yi matrices
stacked upon one another. Each column of y corresponds to a
response variable. Each row of y corresponds to a single
subject-occasion, or to a single subject within a cluster.
Missing values (NA) may occur in any pattern.
}

\item{subj}{
vector of length nrow(y) giving the subject (or cluster)
indicators i for the rows of y. For example, suppose 
that y is in fact rbind(y1,y2,y3,y4) where nrow(y1)=2,
nrow(y2)=3, nrow(y3)=2, and nrow(y4)=7. Then subj should
be c(1,1,2,2,2,3,3,4,4,4,4,4,4,4).
}

\item{pred}{
matrix of covariates used to predict y. This should have the
same number of rows as y. The first column will typically be
constant (one), and the remaining columns correspond to other
variables appearing in Xi and Zi.
}

\item{xcol}{
vector of integers indicating which columns of pred will be
used in Xi. That is, pred[,xcol] is the Xi matrices (stacked
upon one another).
}

\item{zcol}{
vector of integers indicating which columns of pred will be
used in Zi. That is, pred[,zcol] is the Zi matrices (stacked
upon one another).
}

\item{prior}{
a list with four components (whose names are a, Binv, c, and
Dinv, respectively) specifying the hyperparameters of the 
prior distributions for psi and sigma. For information on how
to specify and interpret these hyperparameters, see Schafer
(1997) and the example command file "panex.R" distributed with
this package. Note: This is a slight departure from the
notation in Schafer (1997), where a and Binv were denoted
by "nu1" and "Lambdainv1", and c and Dinv were "nu2" and
"Lambdainv2".
}
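
A prior list for pan() might be assembled as in the sketch below. It
assumes that a and Binv belong to sigma, so that Binv is (r x r), and
that c and Dinv belong to psi, so that Dinv is (q*r x q*r), matching
the dimensions of sigma and psi reported in the returned value. The
degrees of freedom and identity scale matrices are placeholder choices
for illustration, not recommendations.

```r
r <- 2   # number of response variables, ncol(y) (assumed)
q <- 1   # number of random-effect covariates, length(zcol) (assumed)

prior <- list(a    = r,              # assumed prior degrees of freedom for sigma
              Binv = diag(r),        # assumed (r x r) scale matrix
              c    = r * q,          # assumed prior degrees of freedom for psi
              Dinv = diag(r * q))    # assumed (q*r x q*r) scale matrix
```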

\item{seed}{
integer seed for initializing pan()'s internal random number
generator. This argument should be a positive integer. 
}

\item{iter}{
total number of iterations or cycles of the Gibbs sampler
to be carried out.
}

\item{start}{
optional list of quantities to specify the initial state of
the Gibbs sampler. This list has the same form as "last"
(described below), one of the components returned by pan(). 
This argument allows the Gibbs sampler to be restarted from
the final state of a previous run. If "start" is omitted then
pan() chooses its own initial state.
}}

\value{
A list containing the following components. Note that when you
are using pan() to produce multiple imputations, you will
be primarily interested in the component "y" which contains
the imputed data; the arrays "beta", "sigma", and "psi" will
be used primarily for diagnostics (e.g. time-series plots)
to assess the convergence behavior of the Gibbs sampler.

\item{beta}{
array of dimension c(length(xcol),ncol(y),iter) = (p x r x 
number of Gibbs cycles) containing the simulated values of
beta from all cycles. That is, beta[,,T] is the (p x r) matrix of
simulated fixed effects at cycle T.
}

\item{sigma}{
array of dimension c(ncol(y),ncol(y),iter) = (r x r x
number of Gibbs cycles) containing the simulated values of
sigma from all cycles. That is, sigma[,,T] is the simulated version
of the model's sigma at cycle T.
}

\item{psi}{
array of dimension c(length(zcol)*ncol(y), length(zcol)*ncol(y), iter)
= (q*r x q*r x number of Gibbs cycles) containing the simulated values
of psi from all cycles. That is, psi[,,T] is the simulated version of
the model's psi at cycle T.
}

\item{y}{
matrix of imputed data from the final cycle of the Gibbs
sampler. Identical to the input argument y except that the
missing values (NA) have been replaced by imputed values.
If "iter" has been set large enough (which can be determined by
examining time-series plots, etc. of "beta", "sigma", and
"psi") then this is a proper draw from the posterior
predictive distribution of the complete data.
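
A time-series plot of one parameter can be drawn by slicing the
returned array along its third dimension. In this sketch, beta_draws
is a simulated stand-in for the "beta" component of a pan() result;
real draws would be autocorrelated rather than independent noise.

```r
set.seed(2)                         # arbitrary seed
iter <- 200
# hypothetical stand-in for result$beta, dimension c(p, r, iter)
beta_draws <- array(rnorm(2 * 1 * iter), dim = c(2, 1, iter))

trace <- beta_draws[1, 1, ]         # element (1,1) of beta across cycles
plot(trace, type = "l", xlab = "cycle", ylab = "beta[1,1]")
```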
}

\item{last}{
a list of four components characterizing the final state
of the Gibbs sampler. The four components are: "beta", 
"sigma", "psi", and "y", which are the simulated values
of the corresponding model quantities from the final cycle of
Gibbs. This information is already contained in the other
components returned by pan(); we are providing this list merely
as a convenience, to allow the user to start future runs of
the Gibbs sampler at this state.
}}

\details{
The Gibbs sampler algorithm used in pan() is described in
detail by Schafer (1997).
}

\note{
This function assumes that the rows of y (and thus the elements
of subj and the rows of pred) have been sorted by subject number. That is,
we assume that subj=sort(subj), y=y[order(subj),], and
pred=pred[order(subj),]. If the matrix y is created by
stacking yi, i=1,\dots,m then this will automatically be the case.
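
If the data are not already in this order, they can be sorted before
calling pan(). The small data set below is hypothetical.

```r
subj <- c(2, 1, 2, 1, 3)                    # unsorted subject indicators
y    <- matrix(c(5, 1, 6, 2, 9), ncol = 1)
pred <- cbind(1, c(1, 0, 2, 1, 0))

ord  <- order(subj)                         # compute once, apply to all three
y    <- y[ord, , drop = FALSE]
pred <- pred[ord, , drop = FALSE]
subj <- subj[ord]
```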
}

\references{
Schafer, J.L. (1997) Imputation of missing covariates under
a multivariate linear mixed model. Technical report, Dept. of
Statistics, The Pennsylvania State University.
}

\examples{
\dontrun{
# For a detailed example, see the file "panex.R" distributed
# with this function. Here is a simple example of how pan()
# might be used to produce three imputations.

# run Gibbs for 1000 cycles
result <- pan(y,subj,pred,xcol,zcol,prior,seed=9565,iter=1000)           
# first imputation
imp1 <- result$y
# another 1000 cycles
result <- pan(y,subj,pred,xcol,zcol,prior,seed=54324,iter=1000,start=result$last)
# second imputation
imp2 <- result$y
# another 1000 cycles
result <- pan(y,subj,pred,xcol,zcol,prior,seed=698212,iter=1000,start=result$last)
# third imputation
imp3 <- result$y
}}

\keyword{models}

% Converted by Sd2Rd version 1.21.

\eof
\name{pan.bd}
\alias{pan.bd}

\title{
Imputation of multivariate panel or cluster data
}

\description{
Implementation of pan() that restricts the covariance matrix
for the random effects to be block-diagonal. This function
is identical to pan() in every way except that psi is now 
characterized by a set of r matrices of dimension q x q.
}

\usage{
pan.bd(y, subj, pred, xcol, zcol, prior, seed, iter=1, start)
}

\arguments{
\item{y}{
See description for pan().
}

\item{subj}{
See description for pan().
}

\item{pred}{
See description for pan().
}

\item{xcol}{
See description for pan().
}

\item{zcol}{
See description for pan().
}

\item{prior}{
Same as for pan() except that the hyperparameters for psi
have new dimensions. The hyperparameter c is now a vector of
length r, where c[j] contains the prior degrees of freedom for
the jth block portion of psi (j=1,\dots,r). The hyperparameter
Dinv is now an array of dimension c(q,q,r), where Dinv[,,j]
contains the prior scale matrix for the jth block portion of
psi (j=1,\dots,r).
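
The changed shapes of c and Dinv can be sketched as follows. The
values of r and q, the degrees of freedom, and the identity scale
matrices are assumptions for illustration; a and Binv are given the
same assumed form as for pan().

```r
r <- 2   # number of response variables (assumed)
q <- 2   # number of random-effect covariates (assumed)

prior <- list(a    = r,
              Binv = diag(r),
              c    = rep(q, r),                          # one value per block
              Dinv = array(diag(q), dim = c(q, q, r)))   # one (q x q) matrix per block
```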
}

\item{seed}{
See description for pan().
}

\item{iter}{
See description for pan().
}

\item{start}{
See description for pan().
}}

\value{
A list with the same components as that from pan(), with two
minor differences: the dimension of "psi" is now (q x q x r x
"iter"), and the dimension of "last\$psi" is now (q x q x r).
}

\keyword{models}


\eof
