scaleboot             package:scaleboot             R Documentation

_M_u_l_t_i_s_c_a_l_e _B_o_o_t_s_t_r_a_p _R_e_s_a_m_p_l_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     Performs multiscale bootstrap resampling for a specified
     statistic.

_U_s_a_g_e:

     scaleboot(dat,nb,sa,fun,parm=NULL,count=TRUE,weight=TRUE,
               cluster=NULL,onlyboot=FALSE,seed=NULL,...)

     countw.assmax(x,w,ass)

     countw.shtest(x,w,obs)

     countw.shtestass(x,w,assobs)

_A_r_g_u_m_e_n_t_s:

     dat: data matrix or data-frame. Row vectors are to be resampled.

      nb: vector of the numbers of bootstrap replicates.

      sa: vector of scales in sigma squared (sigma^2).

     fun: function for a statistic.

    parm: parameter to be passed to 'fun' above.

   count: logical. Should only the cumulative counts be returned? If
          FALSE, raw statistic vectors are returned.

  weight: logical. If TRUE, resampling is passed to 'fun' above as a
          weight vector. Otherwise, resampling is passed as a vector
          of indices.

 cluster: 'snow' cluster object which may be generated by function
          'makeCluster'.

onlyboot: logical. Should only bootstrap resampling be performed?
          Otherwise, 'sbfit' or 'sbconf' is called internally.

    seed: If non-NULL, the random seed is set. Specifying a seed is
          particularly important when 'cluster' is non-NULL, in which
          case the seeds 'seed + seq(along=cluster)' are assigned to
          the cluster nodes.

     ...: further arguments passed to or from other methods.

       x: data matrix or data-frame passed from 'scaleboot'.

       w: weight vector for resampling.

     ass: a list of association vectors. An example of 'parm' above.

     obs: a vector of observed test statistics. An example of 'parm'
          above.

  assobs: a list of ass and obs above. An example of 'parm' above.
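     The shapes of 'ass', 'obs' and 'assobs' can be sketched as
     follows. All names and values here are made up for illustration.
     The length of 'obs' inside 'assobs' is inferred from the
     definition of 'countw.shtestass' in Details, which compares
     'obs' against an unlisted result with one entry per index in
     'ass'; treat that as an assumption rather than a documented
     requirement.

```r
## Hypothetical association list: each element holds column indices
## of the data matrix (names t1, t2, e1, e2 are made up)
ass <- list(t1 = 1, t2 = 2, e1 = c(1, 2), e2 = c(3, 4))

## Hypothetical observed statistics, one per column of the data
## (the kind of 'parm' taken by countw.shtest)
obs <- c(0.0, 1.2, 3.4, 5.6)

## Hypothetical 'parm' for countw.shtestass: a list with components
## 'ass' and 'obs'; here 'obs' has one made-up value per index in 'ass'
assobs <- list(ass = ass, obs = c(0.0, 1.2, 0.0, 1.2, 3.4, 5.6))

## Number of values the unlisted comparison would produce
n.needed <- length(unlist(ass))
```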

_D_e_t_a_i_l_s:

     These functions are used internally by 'relltest'. 

     'scaleboot' performs multiscale bootstrap resampling for a
     statistic defined by 'fun', which should be one of the two
     possible forms 'fun(x,w,parm)' and 'fun(x,i,parm)'. The former is
     used when 'weight=TRUE', and the weight vector 'w' is generated by
     a multinomial distribution. The latter is used when
     'weight=FALSE', and the index vector 'i' is generated by
     resampling n' elements with replacement from {1,...,n}, where n
     is the sample size and n' is the resample size at the given
     scale. When 'count=TRUE', 'fun' should return a logical, or a
     vector of logicals.
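     A minimal sketch of the two forms of 'fun', using a made-up
     statistic (is each column mean above 'parm'?). The names
     'fun.w', 'fun.i' and 'n1' are hypothetical, and the replicate is
     drawn by hand only to show what 'w' and 'i' look like;
     'scaleboot' generates them itself and may do so differently.

```r
set.seed(1)
x  <- matrix(rnorm(40, mean = 0.3), nrow = 20, ncol = 2)  # toy data
n  <- nrow(x)
n1 <- 10   # resample size n' (assumed value for illustration)

## Form for weight=TRUE: w[k] is how many times row k was drawn
fun.w <- function(x, w, parm) colSums(w * x) / sum(w) > parm

## Form for weight=FALSE: i holds the drawn row indices
fun.i <- function(x, i, parm) colMeans(x[i, , drop = FALSE]) > parm

## One replicate drawn by hand
i <- sample.int(n, n1, replace = TRUE)  # index vector of length n'
w <- tabulate(i, nbins = n)             # the same draw as counts per row

out.w <- fun.w(x, w, 0)   # logical vector, one entry per column
out.i <- fun.i(x, i, 0)   # same statistic computed from the indices
```

     Both forms describe the same resample: tabulating the indices
     gives the weight vector, so the weighted mean and the mean over
     the drawn rows agree up to rounding error.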

     Examples of 'fun(x,w,parm)' are 'countw.assmax' for AU p-values,
     'countw.shtest' for SH-test of trees, and 'countw.shtestass' for
     SH-test of both trees and edges. The definitions are given below.


     countw.assmax <- function(x,w,ass) {
       y <- maxdif(wsumrow(x,w)) <= 0 # countw.max
       if(is.null(ass)) y
       else {
         z <- vector("logical",length(ass))
         for(i in seq(along=ass)) z[i] <- any(y[ass[[i]]])
         z
       }
     }

     countw.shtest <- function(x,w,obs)  maxdif(wsumrow(x,w)) >= obs

     countw.shtestass <- function(x,w,assobs)
       unlist(assmaxdif(wsumrow(x,w),assobs$ass)) >= assobs$obs

     ### weighted sum of row vectors
     ##
     ## x = matrix (array of row vectors)
     ## w = weight vector (for rows)
     ##
     wsumrow <- function(x,w) {
       apply(w*x,2,sum)*nrow(x)/sum(w)
     }

     ### calc max diff
     ##
     ## y[i] := max_{j neq i} x[j] - x[i]
     ##
     maxdif <- function(x) {
       i1 <- which.max(x)  # the largest element
       x <- -x + x[i1]
       x[i1] <- -min(x[-i1])  # the second largest minus the largest
       x
     }

     ### calc assmaxdif
     ##
     ## y[[i]][j] := max_{k neq ass[[i]]} x[k] - x[ass[[i]][j]]
     ##
     assmaxdif <-  function(x,a) {
       y <- vector("list",length(a))
       names(y) <- names(a)
       for(i in seq(along=a))  y[[i]] <- max(x[-a[[i]]]) - x[a[[i]]]
       y
     }

     When 'count=TRUE', the outputs of 'fun' are summed over the
     bootstrap replicates. This gives, for each hypothesis, the number
     of replicates that support it.
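     As a rough illustration of the counting step, the sketch below
     copies the helper definitions given above and draws the weight
     vectors by hand with 'rmultinom' at scale 1. The data, the 'ass'
     list and 'nb' are made up, and 'scaleboot' itself may organize
     the loop differently.

```r
## Copies of the definitions given above
wsumrow <- function(x, w) apply(w * x, 2, sum) * nrow(x) / sum(w)
maxdif <- function(x) {
  i1 <- which.max(x)
  x <- -x + x[i1]
  x[i1] <- -min(x[-i1])
  x
}
countw.assmax <- function(x, w, ass) {
  y <- maxdif(wsumrow(x, w)) <= 0
  if (is.null(ass)) y
  else {
    z <- vector("logical", length(ass))
    for (i in seq(along = ass)) z[i] <- any(y[ass[[i]]])
    z
  }
}

## maxdif on a small vector: y[i] = max_{j != i} x[j] - x[i]
d <- maxdif(c(3, 7, 5))   # gives 4 -2 2

## Toy data: 30 rows, 3 columns (one column per hypothesis)
set.seed(2)
x   <- matrix(rnorm(30 * 3), nrow = 30, ncol = 3)
ass <- list(t1 = 1, e1 = c(2, 3))  # hypothetical associations
nb  <- 200                         # number of replicates (assumed)

## Sum the logical outputs over the replicates
counts <- rep(0, length(ass))
for (b in seq_len(nb)) {
  w <- as.vector(rmultinom(1, size = nrow(x), prob = rep(1, nrow(x))))
  counts <- counts + countw.assmax(x, w, ass)
}
bp <- counts / nb   # raw bootstrap frequencies at this single scale
```

     Because the two elements of this 'ass' cover all columns without
     overlap and exactly one column attains the maximum in each
     replicate, the two counts add up to 'nb'.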

_V_a_l_u_e:

     If 'onlyboot=TRUE', then a list of raw results from the multiscale
     bootstrap resampling is returned. The components are "stat" for
     list vectors of outputs from 'fun' (only when 'count=FALSE'),
     "bps" for a matrix of multiscale bootstrap probabilities (only
     when 'count=TRUE'), "nb" for the number of bootstrap replicates
     used, and "sa" for the scales used. Note that scales are redefined
     by 'sa <- nsize/round(nsize/sa)', where 'nsize' is the sample
     size.
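     The effect of this redefinition can be checked with a small
     calculation. The sample size and requested scales below are
     assumed values chosen for illustration; 'n1' and 'sa.used' are
     hypothetical names.

```r
nsize <- 100              # sample size (assumed)
sa    <- c(0.5, 1, 3)     # requested scales (assumed)

n1      <- round(nsize / sa)  # rounded divisor: 200 100 33
sa.used <- nsize / n1         # redefined scales: 0.5 1 3.0303...
```

     A requested scale is left unchanged when 'nsize/sa' is already an
     integer; otherwise it moves to the nearest scale for which the
     divisor is an integer, as with 3 becoming 100/33 here.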

     If 'onlyboot=FALSE', then the result of a call to 'sbfit' is
     returned when 'count=TRUE', and the result of a call to 'sbconf'
     is returned when 'count=FALSE'.

_A_u_t_h_o_r(_s):

     Hidetoshi Shimodaira

_S_e_e _A_l_s_o:

     'sbfit', 'relltest'.

_E_x_a_m_p_l_e_s:

     ## Not run: 
     ## a line from the definition of relltest
     scaleboot(dat,nb,sa,countw.assmax,ass,cluster=cluster,
                      names.hp=na,nofit=nofit,models=models,seed=seed)

     ## two lines from rell.shtest (internal function)
     scaleboot(z,nb,1,countw.shtest,tobs,cluster=cluster,
                      onlyboot=TRUE,seed=seed)
     scaleboot(z,nb,1,countw.shtestass,pa,cluster=cluster,
                      onlyboot=TRUE,seed=seed)
     ## End(Not run)

