BinaryTree Class            package:party            R Documentation

_C_l_a_s_s "_B_i_n_a_r_y_T_r_e_e"

_D_e_s_c_r_i_p_t_i_o_n:

     A class for representing binary trees.

_O_b_j_e_c_t_s _f_r_o_m _t_h_e _C_l_a_s_s:

     Objects can be created by calls of the form 'new("BinaryTree",
     ...)'. The most important slot is 'tree', a (recursive) list with
     elements

     _n_o_d_e_I_D an integer giving the number of the node, starting with '1'
          in the root node.

     _w_e_i_g_h_t_s the case weights (of the learning sample) corresponding to
          this node.

     _c_r_i_t_e_r_i_o_n a list with test statistics and p-values for each
          partial hypothesis.

     _t_e_r_m_i_n_a_l a logical specifying if this is a terminal node.

     _p_s_p_l_i_t primary split: a list with elements 'variableID' (the
          number of the input variable splitted), 'ordered' (a logical
          whether the input variable is ordered), 'splitpoint' (the
          cutpoint or set of levels to the left), 'splitstatistics'
          saves the process of standardized  two-sample statistics the
          split point estimation is based on. The logical 'toleft'
          determines if observations go left or right down the tree.
          For nominal splits, the slot 'table' is a vector being
          greater zero if the corresponding level is available in the
          corresponding node.

     _s_s_p_l_i_t_s a list of surrogate splits, each with the same elements as
          'psplit'.

     _p_r_e_d_i_c_t_i_o_n the prediction of the node: the mean for numeric
          responses and the conditional class probabilities for 
          nominal or ordered respones. For censored responses, this is
          the mean of the logrank scores and useless as such.

     _l_e_f_t a list representing the left daughter node. 

     _r_i_g_h_t a list representing the right daugther node.

     Please note that this data structure may be subject to change in
     future releases of the package.
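
     The recursive layout described above can be walked with an ordinary
     recursive function. The following is only a sketch on a hand-built
     toy list, not on a fitted tree: the helper 'make_node' and the
     function 'terminal_ids' are hypothetical names introduced for this
     illustration, and only the elements 'nodeID', 'terminal', 'left'
     and 'right' are modelled.

```r
# Hypothetical helper: build a node using the element names described above.
make_node <- function(id, terminal, left = NULL, right = NULL) {
  list(nodeID = id, terminal = terminal, left = left, right = right)
}

# Toy tree: node 1 splits into terminal node 2 and inner node 3,
# which in turn splits into terminal nodes 4 and 5.
toy <- make_node(1L, FALSE,
  left  = make_node(2L, TRUE),
  right = make_node(3L, FALSE,
    left  = make_node(4L, TRUE),
    right = make_node(5L, TRUE)))

# Recursively collect the IDs of all terminal nodes, left to right.
terminal_ids <- function(node) {
  if (isTRUE(node$terminal)) return(node$nodeID)
  c(terminal_ids(node$left), terminal_ids(node$right))
}

terminal_ids(toy)
```

     The same pattern should carry over to the 'tree' slot of a fitted
     object as long as the element names listed above are unchanged.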

_S_l_o_t_s:

     '_d_a_t_a': an object of class 'ModelEnv'.

     '_r_e_s_p_o_n_s_e_s': an object of class '"VariableFrame"' storing the
          values of the response variable(s). 

     '_c_o_n_d__d_i_s_t_r__r_e_s_p_o_n_s_e': a function computing the conditional
          distribution of the response.  

     '_p_r_e_d_i_c_t__r_e_s_p_o_n_s_e': a function for computing predictions. 

     '_t_r_e_e': a recursive list representing the tree. See above. 

     '_w_h_e_r_e': an integer vector of length n (number of observations in
          the learning sample) giving the number of the terminal node
          the corresponding observations is element of. 

     '_p_r_e_d_i_c_t_i_o_n__w_e_i_g_h_t_s': a function for extracting weights from
          terminal nodes. 

     '_g_e_t__w_h_e_r_e': a function for determining the number of terminal
          nodes observations fall into. 

_E_x_t_e_n_d_s:

     Class '"BinaryTreePartition"', directly.

_M_e_t_h_o_d_s:

     '_r_e_s_p_o_n_s_e(_o_b_j_e_c_t, ...)': extract the response variables the tree
          was fitted to.

     '_t_r_e_e_r_e_s_p_o_n_s_e(_o_b_j_e_c_t, _n_e_w_d_a_t_a = _N_U_L_L, ...)': compute statistics
          for the conditional distribution of the response as modelled
          by the tree. For regression problems, this is just the mean.
          For nominal or ordered responses, estimated conditional class
          probabilities are returned. Kaplan-Meier curves are computed
          for censored responses. Note that a list with one element for
          each observation is returned.

     '_P_r_e_d_i_c_t(_o_b_j_e_c_t, _n_e_w_d_a_t_a = _N_U_L_L, ...)': compute predictions.

     '_w_e_i_g_h_t_s(_o_b_j_e_c_t, _n_e_w_d_a_t_a = _N_U_L_L, ...)': extract the weight vector
          from terminal nodes each element of the learning sample is
          element of ('newdata = NULL') and for new observations, 
          respectively.

     '_w_h_e_r_e(_o_b_j_e_c_t, _n_e_w_d_a_t_a = _N_U_L_L, ...)': extract the number of  the
          terminal nodes each element of the learning sample is element
          of ('newdata = NULL') and for new observations, 
          respectively.

     '_n_o_d_e_s(_o_b_j_e_c_t, _w_h_e_r_e, ...)': extract the nodes with given number
          ('where').

     '_p_l_o_t(_x, ...)': a plot method for 'BinaryTree' objects, see
          'plot.BinaryTree'.

     '_p_r_i_n_t(_x, ...)': a print method for 'BinaryTree' objects.

_E_x_a_m_p_l_e_s:

       library("party")

       airq <- subset(airquality, !is.na(Ozone))
       airct <- ctree(Ozone ~ ., data = airq,   
                      controls = ctree_control(maxsurrogate = 3))

       ### distribution of responses in the terminal nodes
       plot(airq$Ozone ~ as.factor(where(airct)))

       ### get all terminal nodes from the tree
       nodes(airct, unique(where(airct)))

       ### extract weights and compute predictions
       pmean <- sapply(weights(airct), function(w) weighted.mean(airq$Ozone, w))

       ### the same as
       drop(Predict(airct))

       ### or
       unlist(treeresponse(airct))

       ### don't use the mean but the median as prediction in each terminal node
       pmedian <- sapply(weights(airct), function(w) 
                                             median(airq$Ozone[rep(1:nrow(airq), w)]))

       plot(airq$Ozone, pmean, col = "red")
       points(airq$Ozone, pmedian, col = "blue")

