bootTWIX                package:TWIX                R Documentation

_B_o_o_t_s_t_r_a_p _o_f _t_h_e _T_W_I_X _t_r_e_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Fits Greedy-TWIX trees to bootstrap samples of the training data.
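
     As a rough illustration of what a bootstrap sample is, the sketch
     below draws 'N' resamples of a data frame in base R. It assumes the
     usual nonparametric bootstrap (rows drawn with replacement, same
     size as the original); how 'bootTWIX' resamples internally is not
     stated on this page. The built-in 'iris' data and the names
     'boot_idx' and 'boot_samples' are illustrative only.

```r
# Sketch only: nonparametric bootstrap of the rows of a data frame.
# 'iris' stands in for the training data; names are hypothetical.
set.seed(42)                       # make the resampling reproducible
N <- 5                             # number of bootstrap replications
n <- nrow(iris)

# One vector of row indices per replication, drawn with replacement.
boot_idx <- replicate(N, sample.int(n, size = n, replace = TRUE),
                      simplify = FALSE)

# Materialise each bootstrap sample as its own data frame.
boot_samples <- lapply(boot_idx, function(i) iris[i, , drop = FALSE])

# Every sample has n rows, but typically fewer than n distinct ones.
sapply(boot_samples, nrow)
sapply(boot_idx, function(i) length(unique(i)))
```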

_U_s_a_g_e:

     bootTWIX(formula, data=NULL,test.data=0,N=1,topN=1,subset=NULL,
                         method="deviance",topn.method="complete",
                         cluster=NULL,minsplit=30,minbucket=round(minsplit/3),
                         Devmin=0.1,level=20,score=1,tol=0.15)

_A_r_g_u_m_e_n_t_s:

 formula: formula of the form 'y ~ x1 + x2 + ...', where 'y' must be a
          factor and 'x1,x2,...' are numeric.

    data: an optional data frame containing the variables in the model
          (training data).

test.data: a data frame containing new data.

       N: an integer giving the number of bootstrap replications.

    topN: integer vector giving the number of splits to select at each
          level. If of length 1, the same number of splits is selected
          at every level. If of length > 1, for example 'topN=c(3,2)',
          3 splits are chosen at the first level, 2 splits at the
          second level, and 1 split at all subsequent levels.

  subset: an optional vector specifying a subset of observations to be
          used.

  method: which split points to use. This can be '"deviance"'
          (default), '"grid"' or '"local"'. If 'method' is set to:
           '"local"' - the program uses the local maxima of the split
          function (entropy),
           '"deviance"' - all values of the entropy,
           '"grid"' - grid points.

topn.method: one of '"complete"' (default) or '"single"'. Specifies how
          split points are considered. If set to '"complete"', split
          points from all variables are used; otherwise split points
          are used per variable.

 cluster: name of the cluster, if parallel computing is to be used.

minsplit: the minimum number of observations that must exist in a node.

minbucket: the minimum number of observations in any terminal (leaf)
          node.

  Devmin: the minimum improvement in entropy required for a split.

   level: maximum depth of the trees. If 'level' is set to 1, the trees
          consist of the root node only.

   score: a parameter, which can be '1' (default) or '2'. If it is '2',
          the _sort_-function will be used;
           if it is set to '1', the _weight_-function will be used:
           'score =
          0.25*scale(dev.tr)+0.6*scale(fit.tr)+0.15*(tree.structure)'


     tol: parameter that is used if 'topn.method' is set to
          '"single"'.
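
     The 'topN' padding rule, the 'minbucket' default and the weighted
     score formula above can be sketched in base R as follows. This is
     not TWIX code: 'expand_topN' and its 'n_levels' argument are
     hypothetical, and the numbers given to 'dev.tr', 'fit.tr' and
     'tree.structure' are made-up toy values, since this page does not
     define those quantities. Only the arithmetic is taken from the
     text.

```r
# Hypothetical helper (not part of TWIX): expand 'topN' to one entry
# per split level. A length-1 value is recycled; a longer vector is
# padded with 1 for all later levels, as described for 'topN'.
expand_topN <- function(topN, n_levels) {
  stopifnot(length(topN) >= 1, n_levels >= 1)
  if (length(topN) == 1) {
    rep(topN, n_levels)
  } else {
    padded <- c(topN, rep(1, max(0, n_levels - length(topN))))
    padded[seq_len(n_levels)]
  }
}

expand_topN(c(3, 2), 5)   # 3 2 1 1 1
expand_topN(2, 4)         # 2 2 2 2

# Default 'minbucket' as given in the usage line.
minsplit  <- 30
minbucket <- round(minsplit / 3)   # 10

# Weighted score from the 'score' argument, on toy values for three
# candidate trees. scale() centres and standardises each vector.
dev.tr         <- c(120, 95, 110)      # toy values, meaning assumed
fit.tr         <- c(0.80, 0.90, 0.85)  # toy values, meaning assumed
tree.structure <- c(0.5, 0.7, 0.6)     # toy values, meaning assumed

score <- 0.25 * as.numeric(scale(dev.tr)) +
         0.60 * as.numeric(scale(fit.tr)) +
         0.15 * tree.structure
score
```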

_V_a_l_u_e:

     a list with the following components:

    call: the call generating the object.

   trees: a list of all constructed trees, including ID, Dev ... for
          each tree.

_S_e_e _A_l_s_o:

     'get.tree', 'predict.TWIX', 'deviance.TWIX', 'bagg.TWIX'

_E_x_a_m_p_l_e_s:

     data(olives)
     #Tree <- bootTWIX(Region~.,data=olives,N=5)
     #Tree$trees
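
     A guarded variant of the commented-out example above, for running
     it only where the TWIX package is available. The call and the
     'olives' data are taken from the example; the 'requireNamespace'
     guard, the seed and the use of 'str' to inspect the result are
     additions for illustration. Run time with 'N = 5' is not known
     here and may be noticeable.

```r
# Runs only if TWIX is installed; otherwise the block is skipped.
if (requireNamespace("TWIX", quietly = TRUE)) {
  library(TWIX)
  data(olives)

  set.seed(1)  # added here so the bootstrap draws are reproducible
  Tree <- bootTWIX(Region ~ ., data = olives, N = 5)

  # 'trees' is documented as a list of all constructed trees;
  # show only its top-level structure.
  str(Tree$trees, max.level = 1)
}
```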

