baysout                package:dprep                R Documentation

_O_u_t_l_i_e_r _d_e_t_e_c_t_i_o_n  _u_s_i_n_g _t_h_e _B_a_y _a_n_d _S_c_h_w_a_b_a_c_h_e_r'_s _a_l_g_o_r_i_t_h_m.

_D_e_s_c_r_i_p_t_i_o_n:

     This function implements the outlier detection algorithm of Bay
     and Schwabacher (2003), which assigns an outlyingness measure to
     each observation and returns the indexes of those with the
     largest measures. The number of outliers to return is specified
     by the user.

_U_s_a_g_e:

     baysout(D, blocks = 5, k = 3, num.out = 10)

_A_r_g_u_m_e_n_t_s:

       D: the dataset under study

  blocks: the number of sections into which the entire dataset is
          divided. It must be at least as large as the number of
          outliers requested.

       k: the number of neighbors to find for each observation

 num.out: the number of outliers to return

_V_a_l_u_e:

 num.out: a two-column matrix containing the indexes of the
          observations with the top num.out outlyingness measures. A
          plot of the top candidates and their measures is also
          displayed.

_A_u_t_h_o_r(_s):

     Caroline Rodriguez (2004). Modified by Elio Lozano (2005).

_R_e_f_e_r_e_n_c_e_s:

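     Bay, S.D., and Schwabacher, M. (2003). Mining distance-based
     outliers in near linear time with randomization and a simple
     pruning rule. In Proceedings of the Ninth ACM SIGKDD
     International Conference on Knowledge Discovery and Data Mining.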

_E_x_a_m_p_l_e_s:

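     #---- Outlier detection using Bay and Schwabacher's algorithm ----
     library(dprep)
     data(bupa)
     # Rows where column 7 equals 1, restricted to columns 1 to 6
     bupa.out <- baysout(bupa[bupa[, 7] == 1, 1:6], blocks = 10, num.out = 10)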
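
     #---- Naive sketch of a nearest-neighbour outlyingness score ----
     # Illustration only: this is not the code inside baysout(). It
     # assumes the measure is the mean Euclidean distance to the k
     # nearest neighbours, a common distance-based definition, and it
     # omits the blocking, randomization and pruning that the
     # algorithm uses for speed. knn.score is a hypothetical helper.
     knn.score <- function(D, k = 3) {
       dm <- as.matrix(dist(D))  # all pairwise Euclidean distances
       diag(dm) <- Inf           # a point is not its own neighbour
       apply(dm, 1, function(d) mean(sort(d)[1:k]))
     }
     set.seed(1)
     # Twenty points near the origin plus one distant point (row 21)
     toy <- rbind(matrix(rnorm(40), ncol = 2), c(10, 10))
     scores <- knn.score(toy, k = 3)
     top <- order(scores, decreasing = TRUE)[1:3]
     cbind(index = top, measure = scores[top])  # row 21 ranks first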

