| betabinomial {VGAM} | R Documentation |
Fits a beta-binomial distribution by maximum likelihood estimation. The two parameters here are the mean and correlation coefficient.
betabinomial(lmu="logit", lrho="logit", emu=list(), erho=list(),
irho=NULL, method.init=1, shrinkage.init=0.95,
nsimEIM=NULL, zero=2)
lmu, lrho |
Link functions applied to the two parameters.
See Links for more choices.
The defaults ensure the parameters remain in (0,1),
however, see the warning below.
|
emu, erho |
List. Extra argument for each of the links.
See earg in Links for general information.
|
irho |
Optional initial value for the correlation parameter.
If given, it must be in (0,1), and is recyled to the necessary
length. Assign this argument a value if a convergence failure occurs.
Having irho=NULL means an initial value is obtained internally,
though this can give unsatisfactory results.
|
method.init |
An integer with value 1 or 2 or ...,
which specifies the initialization method for mu.
If failure to converge occurs try the another value
and/or else specify a value for irho.
|
zero |
An integer specifying which
linear/additive predictor is to be modelled as an intercept only.
If assigned, the single value should be either 1 or 2.
The default is to have a single correlation parameter.
To model both parameters as functions of the covariates assign
zero=NULL.
See CommonVGAMffArguments for more information.
|
shrinkage.init, nsimEIM |
See CommonVGAMffArguments for more information.
The argument shrinkage.init is used only if method.init=2.
Using the argument nsimEIM may offer large advantages for large
values of N and/or large data sets.
|
There are several parameterizations of the beta-binomial distribution.
This family function directly models the mean and correlation
parameter, i.e.,
the probability of success.
The model can be written
T|P=p ~ Binomial(N,p)
where P has a beta distribution with shape parameters
alpha and beta. Here,
N is the number of trials (e.g., litter size),
T=NY is the number of successes, and
p is the probability of a success (e.g., a malformation).
That is, Y is the proportion of successes. Like
binomialff, the fitted values are the
estimated probability
of success (i.e., E[Y] and not E[T])
and the prior weights N are attached separately on the
object in a slot.
The probability function is
P(T=t) = choose(N,t) B(alpha+t, beta+N-t) / B(alpha, beta)
where t=0,1,...,N, and B is the
beta function
with shape parameters alpha and beta.
Recall Y = T/N is the real response being modelled.
The default model is eta1 =logit(mu) and eta2 = logit(rho) because both parameters lie between 0 and 1. The mean (of Y) is p = mu = alpha / (alpha + beta) and the variance (of Y) is mu(1-mu)(1+(N-1)rho)/N. Here, the correlation rho is given by 1/(1 + alpha + beta) and is the correlation between the N individuals within a litter. A litter effect is typically reflected by a positive value of rho. It is known as the over-dispersion parameter.
This family function uses Fisher scoring. Elements of the second-order expected derivatives with respect to alpha and beta are computed numerically, which may fail for large alpha, beta, N or else take a long time.
An object of class "vglmff" (see vglmff-class).
The object is used by modelling functions such as vglm.
Suppose fit is a fitted beta-binomial model. Then
fit@y contains the sample proportions y,
fitted(fit) returns estimates of E(Y), and
weights(fit, type="prior") returns the number
of trials N.
If the estimated rho parameter is close to zero then it pays to try
lrho="rhobit". One day this may become the default link function.
This family function is prone to numerical difficulties
due to the expected information matrices not being positive-definite
or ill-conditioned over some regions of the parameter space.
If problems occur try setting irho to some numerical value,
nsimEIM=100, say,
or else use etastart argument of
vglm, etc.
This function processes the input in the same way
as binomialff. But it does not handle
the case N=1 very well because there are two
parameters to estimate, not one, for each row of the input.
Cases where N=1 can be omitted via the
subset argument of vglm.
The extended beta-binomial distribution of Prentice (1986)
is currently not implemented in the VGAM package as it has
range-restrictions for the correlation parameter that are currently
too difficult to handle in this package.
However, try lrho="rhobit".
T. W. Yee
Moore, D. F. and Tsiatis, A. (1991) Robust estimation of the variance in moment methods for extra-binomial and extra-Poisson variation. Biometrics, 47, 383–401.
Prentice, R. L. (1986) Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. Journal of the American Statistical Association, 81, 321–327.
betabin.ab,
Betabin,
binomialff,
betaff,
dirmultinomial,
lirat.
# Example 1
N = 10; mu = 0.5; rho = 0.8
y = rbetabin(n=100, size=N, prob=mu, rho=rho)
fit = vglm(cbind(y,N-y) ~ 1, betabinomial, trace=TRUE)
coef(fit, matrix=TRUE)
Coef(fit)
head(cbind(fit@y, weights(fit, type="prior")))
# Example 2
fit = vglm(cbind(R,N-R) ~ 1, betabinomial, data=lirat,
trace=TRUE, subset=N>1)
coef(fit, matrix=TRUE)
Coef(fit)
t(fitted(fit))
t(fit@y)
t(weights(fit, type="prior"))
# Example 3, which is more complicated
lirat = transform(lirat, fgrp = factor(grp))
summary(lirat) # Only 5 litters in group 3
fit2 = vglm(cbind(R,N-R) ~ fgrp + hb, betabinomial(zero=2),
data=lirat, trace=TRUE, subset=N>1)
coef(fit2, matrix=TRUE)
## Not run:
with(lirat, plot(hb[N>1], fit2@misc$rho,
xlab="Hemoglobin", ylab="Estimated rho",
pch=as.character(grp[N>1]), col=grp[N>1]))
## End(Not run)
## Not run:
# cf. Figure 3 of Moore and Tsiatis (1991)
with(lirat, plot(hb, R/N, pch=as.character(grp), col=grp, las=1,
xlab="Hemoglobin level", ylab="Proportion Dead",
main="Fitted values (lines)"))
smalldf = with(lirat, lirat[N>1,])
for(gp in 1:4) {
xx = with(smalldf, hb[grp==gp])
yy = with(smalldf, fitted(fit2)[grp==gp])
o = order(xx)
lines(xx[o], yy[o], col=gp)
}
## End(Not run)