errormatrix               package:klaR               R Documentation

_T_a_b_u_l_a_t_i_o_n _o_f _p_r_e_d_i_c_t_i_o_n _e_r_r_o_r_s _b_y _c_l_a_s_s_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Cross-tabulates true and predicted classes, with the option to
     show relative frequencies.

_U_s_a_g_e:

     errormatrix(true, predicted, relative = FALSE)

_A_r_g_u_m_e_n_t_s:

    true: Vector of true classes.

predicted: Vector of predicted classes.

relative: Logical. If 'TRUE', rows are normalized to show relative
          frequencies (see Details).

_D_e_t_a_i_l_s:

     Given vectors of true and predicted classes, a (square) table
     of misclassifications is constructed.

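     Element [i,j] shows the number of objects of class i that were
     classified as class j, so the main diagonal shows the correct
     classifications. The last row and column show the corresponding
     sums of misclassifications: the last column counts, for each true
     class, the objects that were misclassified, and the last row
     counts, for each predicted class, the objects wrongly assigned to
     it. The lower right element is the total number of
     misclassifications.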
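
     The layout described above can be sketched in base R. This is
     only an illustration under stated assumptions, not the package's
     implementation: the toy labels are invented, and the '-SUM-'
     border label is chosen here for illustration.

```r
# Hypothetical toy labels, invented for this sketch (not from klaR).
true      <- factor(c("a", "a", "a", "b", "b", "c", "c", "c"))
predicted <- factor(c("a", "b", "a", "b", "c", "c", "c", "a"),
                    levels = levels(true))

counts  <- table(true, predicted)  # element [i, j]: class i classified as j
correct <- diag(counts)            # main diagonal: correct classifications

row_errors   <- rowSums(counts) - correct  # misclassified objects per true class
col_errors   <- colSums(counts) - correct  # wrong assignments per predicted class
total_errors <- sum(row_errors)            # total number of misclassifications

# Assemble the bordered table; the "-SUM-" label is an assumption.
n      <- nlevels(true)
labels <- c(levels(true), "-SUM-")
tab    <- matrix(0, n + 1, n + 1,
                 dimnames = list(true = labels, predicted = labels))
tab[1:n, 1:n]     <- counts
tab[1:n, n + 1]   <- row_errors
tab[n + 1, 1:n]   <- col_errors
tab[n + 1, n + 1] <- total_errors
print(tab)
```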

     If 'relative' is 'TRUE', the _rows_ are normalized so they show
     relative frequencies instead. The lower right element now shows
     the total error rate, and the remaining elements of the last row
     sum to one, so they show "where the misclassifications went".
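
     The normalization can be sketched in base R as follows. Again
     this is a minimal illustration, not the package's code: the toy
     labels and the '-SUM-' border label are assumptions made for
     the example.

```r
# Hypothetical toy labels, invented for this sketch (not from klaR).
true      <- factor(c("a", "a", "a", "b", "b", "c", "c", "c"))
predicted <- factor(c("a", "b", "a", "b", "c", "c", "c", "a"),
                    levels = levels(true))

counts       <- table(true, predicted)
class_sizes  <- rowSums(counts)
row_errors   <- class_sizes - diag(counts)      # misclassified per true class
col_errors   <- colSums(counts) - diag(counts)  # wrong assignments per predicted class
total_errors <- sum(row_errors)

n      <- nlevels(true)
labels <- c(levels(true), "-SUM-")              # border label is an assumption
rel    <- matrix(0, n + 1, n + 1,
                 dimnames = list(true = labels, predicted = labels))

rel[1:n, 1:n]     <- prop.table(counts, margin = 1)  # each row sums to one
rel[1:n, n + 1]   <- row_errors / class_sizes        # per-class error rates
rel[n + 1, 1:n]   <- col_errors / total_errors       # "where the errors went"
rel[n + 1, n + 1] <- total_errors / sum(counts)      # total error rate
print(round(rel, 3))
```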

_V_a_l_u_e:

     A (named) matrix.

_N_o_t_e:

     Concerning the case that 'relative' is 'TRUE':

     If a prior distribution over the classes is given, the 
     misclassification rate that is returned as the lower right 
     element (which is only the fraction of misclassified  _data_) is
     not an estimator for the expected  misclassification rate.

     In that case you have to multiply the individual error rates for
     each class (returned in the last column) by the corresponding
     prior probabilities and sum these up (see example below).

     Both error rate estimates are equal if the fractions of classes
     in the data are equal to the prior probabilities.
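
     A small base R sketch of this relationship, using invented toy
     labels and a hypothetical prior (neither comes from the
     package): weighting the per-class error rates by the class
     fractions in the data reproduces the fraction of misclassified
     data, while weighting by a different prior generally gives a
     different value.

```r
# Hypothetical toy labels, invented for this sketch (not from klaR).
true      <- factor(c("a", "a", "a", "b", "b", "c", "c", "c"))
predicted <- factor(c("a", "b", "a", "b", "c", "c", "c", "a"),
                    levels = levels(true))

counts      <- table(true, predicted)
class_sizes <- rowSums(counts)

# Per-class error rates (what the last column holds when relative = TRUE).
class_rates <- (class_sizes - diag(counts)) / class_sizes

# Fraction of misclassified data (the lower right element).
data_rate <- (sum(counts) - sum(diag(counts))) / sum(counts)

# Weighting by the class fractions in the data gives the same number.
data_fractions <- class_sizes / sum(counts)
weighted_by_data <- sum(class_rates * data_fractions)

# Weighting by a hypothetical prior gives the expected error rate.
prior <- c(a = 0.2, b = 0.3, c = 0.5)
expected_rate <- sum(class_rates * prior[names(class_rates)])

print(c(data = data_rate, prior_weighted = expected_rate))
```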

_A_u_t_h_o_r(_s):

     Christian Röver, roever@statistik.uni-dortmund.de

_S_e_e _A_l_s_o:

     'table'

_E_x_a_m_p_l_e_s:

     data(iris)
     library(MASS)   # provides lda()
     library(klaR)   # provides errormatrix()
     x <- lda(Species ~ Sepal.Length + Sepal.Width, data=iris)
     y <- predict(x, iris)

     # absolute numbers: 
     errormatrix(iris$Species, y$class)

     # relative frequencies: 
     errormatrix(iris$Species, y$class, relative = TRUE)

     # percentages: 
     round(100 * errormatrix(iris$Species, y$class, relative = TRUE), 0)

     # expected error rate in case of class prior:
     # (the prior must list the classes in the same order as the rows
     # of the error matrix)
     indiv.rates <- errormatrix(iris$Species, y$class, relative = TRUE)[1:3, 4]
     prior <- c("setosa" = 0.2, "versicolor" = 0.3, "virginica" = 0.5)
     total.rate <- sum(indiv.rates * prior)
     total.rate

