| estimator {emulator} | R Documentation |
Uses Bayesian techniques to estimate a model's prediction at each of
n datapoints. To estimate the i-th point, the conditioning variables are
points 1, ..., i-1 and i+1, ..., n inclusive (i.e., all points
except point i).
This routine is useful when finding optimal coefficients for the correlation using boot methods.
estimator(val, A, d, scales=NULL, pos.def.matrix=NULL, func=regressor.basis, power=2)
val |
Design matrix with rows corresponding to points at which the function is known |
A |
Correlation matrix (note that this is not the inverse of the correlation matrix) |
d |
Vector of observations |
scales |
Scales to be used to calculate t(x). Note that
scales has no usable default value (NULL is only a placeholder) because estimator() is
most often used in the context of assessing the appropriateness of a
given value of scales. If the desired distance matrix
(called B in Oakley and O'Hagan) is not diagonal, pass this matrix to
estimator() via the pos.def.matrix argument. |
pos.def.matrix |
Positive definite matrix B |
func |
Function used to determine basis vectors, defaulting
to regressor.basis if not given. |
power |
Exponent in exponential |
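The relationship between scales and pos.def.matrix can be illustrated with a short sketch in base R. It assumes the correlation between two points takes the form exp(-(x-x')^T B (x-x')) used by Oakley and O'Hagan, with power=2; corr_scales() and corr_B() are hypothetical helpers written for this illustration and are not functions of the package.

```r
# Hypothetical helper: correlation with a diagonal B given by 'scales'.
corr_scales <- function(x1, x2, scales, power = 2) {
  exp(-sum(scales * abs(x1 - x2)^power))
}

# Hypothetical helper: correlation with a general positive definite matrix B.
corr_B <- function(x1, x2, B) {
  dx <- x1 - x2
  exp(-drop(t(dx) %*% B %*% dx))
}

x1 <- c(0.1, 0.4, 0.9)
x2 <- c(0.3, 0.2, 0.5)
scales <- c(1, 2, 3)

a <- corr_scales(x1, x2, scales)      # diagonal case via scales
b <- corr_B(x1, x2, diag(scales))     # same B passed as a full matrix
isTRUE(all.equal(a, b))               # the two forms agree when power = 2

# A non-diagonal B couples the inputs and has no 'scales' equivalent.
B <- matrix(c(1, 0.5, 0,
              0.5, 2, 0,
              0, 0, 3), 3, 3)
corr_B(x1, x2, B)
```

Under these assumptions, supplying scales amounts to supplying diag(scales) as B, which is why a non-diagonal B has to go through pos.def.matrix instead.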
Given a matrix of observation points and a vector of observations,
estimator() returns a vector of predictions. Each prediction is
made in a three step process. For each index i:
First, observation d[i] is discarded, and row i and
column i are deleted from A (giving A[-i,-i]).
Thus d[-i] and A[-i,-i] are
the observation vector and correlation matrix that would have been
obtained had observation i not been available.
Second, the value of d[i] is estimated on the basis of the
shortened observation vector and the comatrix A[-i,-i].
It is then possible to make a scatterplot of d vs dhat
where dhat=estimator(val,A,d). If the scales used are
“good”, then the points of this scatterplot will be close to
abline(0,1). The third step is to optimize the goodness of fit
of this scatterplot with respect to the scales.
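The three steps can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: corr_sketch() and loo_sketch() are hypothetical helpers, the basis is assumed to be a constant plus each input, the prediction is taken to be the usual posterior mean (generalized least squares fit plus a correlated correction of the residuals), and the correlations between point i and the retained points are read directly from row i of A rather than recomputed from scales.

```r
# Hypothetical helper: Gaussian-type correlation matrix with B = diag(scales).
corr_sketch <- function(X, scales, power = 2) {
  n <- nrow(X)
  A <- matrix(1, n, n)
  for (i in seq_len(n)) for (j in seq_len(n))
    A[i, j] <- exp(-sum(scales * abs(X[i, ] - X[j, ])^power))
  A
}

# Hypothetical helper: leave-one-out predictions (steps one and two).
loo_sketch <- function(X, A, d) {
  H <- cbind(1, X)                      # assumed basis: constant plus inputs
  sapply(seq_len(nrow(X)), function(i) {
    Ai <- A[-i, -i, drop = FALSE]       # correlation matrix without point i
    Hi <- H[-i, , drop = FALSE]         # basis matrix without point i
    di <- d[-i]                         # observations without point i
    # generalized least squares estimate of the regression coefficients
    beta <- solve(crossprod(Hi, solve(Ai, Hi)), crossprod(Hi, solve(Ai, di)))
    resid <- di - drop(Hi %*% beta)
    # regression prediction plus correlated correction from the residuals
    sum(H[i, ] * beta) + sum(A[i, -i] * solve(Ai, resid))
  })
}

set.seed(1)
X <- matrix(runif(20), 10, 2)           # 10 design points in 2 dimensions
d <- sin(3 * X[, 1]) + X[, 2]           # toy stand-in for a computer model

# Step three: score a common scale by the squared distance of the
# (d, dhat) scatterplot from the line y = x, then minimize that score.
misfit <- function(log_scale) {
  s <- rep(exp(log_scale), ncol(X))
  sum((d - loo_sketch(X, corr_sketch(X, s), d))^2)
}
best <- optimize(misfit, interval = c(0, 5))
dhat <- loo_sketch(X, corr_sketch(X, rep(exp(best$minimum), ncol(X))), d)
```

The search is over the logarithm of a single common scale so that the scale stays positive; a separate scale per input would call for a multivariate optimizer such as optim() instead.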
A vector of predictions of the same length as d.
Robin K. S. Hankin
J. Oakley and A. O'Hagan, 2002. “Bayesian Inference for the Uncertainty Distribution of Computer Model Outputs”, Biometrika 89(4), pp. 769-784
R. K. S. Hankin, 2005. “Introducing BACCO, an R bundle for Bayesian analysis of computer code output”, Journal of Statistical Software, 14(16)
# example has 40 observations on 6 dimensions.
# function is just sum( (1:6)*x) where x=c(x_1, ... , x_6)
val <- latin.hypercube(40,6)
colnames(val) <- letters[1:6]
d <- apply(val,1,function(x){sum((1:6)*x)})
#pick some scales:
fish <- rep(1,ncol(val))
A <- corr.matrix(val,scales=fish, power=2)
# add some suitably correlated noise (rmvnorm() is from the mvtnorm package):
d <- as.vector(rmvnorm(n=1, mean=d, sigma=0.1*A))
# estimate d using the leave-one-out technique in estimator():
d.est <- estimator(val, A, d, scales=fish, power=2)
#and plot the result:
lims <- range(c(d,d.est))
par(pty="s")
plot(d, d.est, xaxs="r", yaxs="r", xlim=lims, ylim=lims)
abline(0,1)