estimator              package:emulator              R Documentation

_E_s_t_i_m_a_t_e_s _e_a_c_h _k_n_o_w_n _d_a_t_a_p_o_i_n_t _u_s_i_n_g _t_h_e _o_t_h_e_r_s _a_s _d_a_t_a_p_o_i_n_t_s

_D_e_s_c_r_i_p_t_i_o_n:

     Uses Bayesian techniques to estimate a model's prediction at each
     of 'n' datapoints.  To estimate the i-th point, conditioning
     variables of 1, ..., i-1 and i+1, ..., n inclusive are used (i.e.,
     all points except point i).

     This routine is useful when finding optimal coefficients for the
     correlation function using bootstrap methods.

_U_s_a_g_e:

     estimator(val, A, d, scales=NULL, pos.def.matrix=NULL,
     func=regressor.basis, power=2)

_A_r_g_u_m_e_n_t_s:

     val: Design matrix with rows corresponding to points at which the
          function is known

       A: Correlation matrix (note that this is *not* the inverse of
          the correlation matrix)

       d: Vector of observations

  scales: Scales to be used to calculate 't(x)'.  Note that 'scales'
          has no default value because 'estimator()' is most often used
          in the context of assessing the appropriateness of a given
          value of 'scales'.  If the desired distance matrix (called B
          in Oakley) is not diagonal, pass this matrix to 'estimator()'
          via the 'pos.def.matrix' argument.

pos.def.matrix: Positive definite matrix B, used in place of 'scales'
          when the desired distance matrix is not diagonal

    func: Function used to determine basis vectors, defaulting to
          'regressor.basis' if not given.

   power: Exponent in the exponential of the correlation function
          (default 2)
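When the distance matrix B is not diagonal, it can be supplied directly
via 'pos.def.matrix'.  The following sketch illustrates this; the matrix
'M' is a hypothetical positive definite matrix chosen for illustration,
and 'val' and 'd' are as in the Examples section below:

```r
# Sketch: passing a non-diagonal positive definite matrix B directly.
# M is hypothetical; any symmetric positive definite 6x6 matrix works.
M <- crossprod(matrix(rnorm(36), 6, 6)) + diag(6)   # guaranteed pos. def.
A <- corr.matrix(val, pos.def.matrix = M)
d.est <- estimator(val, A, d, pos.def.matrix = M)
```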

_D_e_t_a_i_l_s:

     Given a matrix of observation points and a vector of observations,
     'estimator()' returns a vector of predictions.  Each prediction is
     made in a three-step process.  For each index i:

        *  Observation 'd[i]' is discarded, and row 'i' and column 'i'
           are deleted from 'A' (giving 'A[-i,-i]'). Thus 'd[-i]' and
           'A[-i,-i]' are the observation vector and correlation matrix
           that would have been obtained had observation 'i' not been
           available.

        *  The value of 'd[i]' is estimated on the basis of the
           shortened observation vector 'd[-i]' and the reduced
           correlation matrix 'A[-i,-i]'.

     It is then possible to make a scatterplot of 'd' vs 'dhat' where
     'dhat=estimator(val,A,d)'.  If the scales used are "good", then
     the points of this scatterplot will be close to 'abline(0,1)'. 
     The third step is to optimize the goodness of fit of this
     scatterplot.
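The leave-one-out structure described above can be sketched in base R.
The function below ('loo.sketch()', not part of the package) uses a
plain zero-mean simple-kriging predictor for clarity; the real
'estimator()' additionally accounts for the regression basis supplied
via 'func':

```r
# Illustrative leave-one-out predictor (zero-mean simple kriging).
# NOT the exact computation in estimator(), which also uses func().
loo.sketch <- function(A, d) {
  n <- length(d)
  sapply(seq_len(n), function(i) {
    a <- A[i, -i, drop = FALSE]                # correlations with the other points
    as.numeric(a %*% solve(A[-i, -i], d[-i]))  # condition on d[-i]
  })
}
```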

_V_a_l_u_e:

     A vector of predictions of the same length as 'd'.

_A_u_t_h_o_r(_s):

     Robin K. S. Hankin

_R_e_f_e_r_e_n_c_e_s:

        *  J. Oakley and A. O'Hagan, 2002. _Bayesian Inference for the
           Uncertainty Distribution of Computer Model Outputs_,
           Biometrika, 89(4), pp. 769-784

        *  R. K. S. Hankin 2005. _Introducing BACCO, an R bundle for
           Bayesian analysis of computer code output_, Journal of
           Statistical Software, 14(16)

_S_e_e _A_l_s_o:

     'optimal.scales'

_E_x_a_m_p_l_e_s:

     # example has 40 observations on 6 dimensions.
     # function is just sum((1:6)*x) where x=c(x_1, ..., x_6)

     val <- latin.hypercube(40,6)
     colnames(val) <- letters[1:6]
     d <- apply(val,1,function(x){sum((1:6)*x)})

     #pick some scales:
     fish <- rep(1,ncol(val))
     A <- corr.matrix(val,scales=fish, power=2)

     #add some suitably correlated noise:
     d <- as.vector(rmvnorm(n=1, mean=d, sigma=0.1*A))

     # estimate d using the leave-one-out technique in estimator():
     d.est <- estimator(val, A, d, scales=fish, power=2)

     #and plot the result:
     lims <- range(c(d,d.est))
     par(pty="s")
     plot(d, d.est, xaxs="r", yaxs="r", xlim=lims, ylim=lims)
     abline(0,1)
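The third step described under Details, scoring the scatterplot's
closeness to 'abline(0,1)', can be automated.  The helper below is an
illustrative sketch, not part of the package; 'optimal.scales()'
performs this search properly:

```r
# Sketch: sum of squared residuals as a goodness-of-fit score for a
# candidate 'scales' vector; smaller is better.
badness <- function(scales) {
  A <- corr.matrix(val, scales = scales, power = 2)
  sum((d - estimator(val, A, d, scales = scales, power = 2))^2)
}
badness(fish)
```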
       

