sigest                package:kernlab                R Documentation

_H_y_p_e_r_p_a_r_a_m_e_t_e_r _e_s_t_i_m_a_t_i_o_n _f_o_r _t_h_e _G_a_u_s_s_i_a_n _R_a_d_i_a_l _B_a_s_i_s _k_e_r_n_e_l

_D_e_s_c_r_i_p_t_i_o_n:

     Estimates a range of values for the "sigma" inverse width parameter
     of the Gaussian Radial Basis kernel for use with Support Vector
     Machines. The estimation is based on the data to be used.

_U_s_a_g_e:

     ## S4 method for signature 'formula':
     sigest(x, data=NULL, frac = 0.25, na.action = na.omit, scaled = TRUE)
     ## S4 method for signature 'matrix':
     sigest(x, frac = 0.25, scaled = TRUE, na.action = na.omit)

_A_r_g_u_m_e_n_t_s:

       x: a symbolic description of the model upon which the estimation
          is based. When not using a formula, x is a matrix or vector
          containing the data.

    data: an optional data frame containing the variables in the model.
          By default the variables are taken from the environment from
          which 'sigest' is called.

    frac: Fraction of data to use for estimation. By default a quarter
          of the data is used to estimate the range of the sigma
          hyperparameter.

  scaled: A logical vector indicating the variables to be scaled. If
          'scaled' is of length 1, the value is recycled as many times
          as needed and all non-binary variables are scaled. By
          default, data are scaled internally to zero mean and unit
          variance (since this is the default action in 'ksvm' as well).
          The center and scale values are returned and used for later
          predictions.

na.action: A function to specify the action to be taken if 'NA's are
          found. The default action is 'na.omit', which leads to
          rejection of cases with missing values on any required
          variable. An alternative is 'na.fail', which causes an error
          if 'NA' cases are found. (NOTE: If given, this argument must
          be named.)
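
     The effect of the 'scaled' and 'na.action' defaults can be
     illustrated with base R alone. The sketch below uses a small
     made-up matrix and the base functions 'na.omit' and 'scale';
     whether 'sigest' calls exactly these functions internally is an
     assumption, so treat it as an illustration of the described
     behaviour rather than of the implementation.

```r
## Toy data with one missing value (made up for illustration)
x <- cbind(a = c(1, 2, 3, 4, NA), b = c(10, 20, 30, 40, 50))

## na.omit drops every row that contains an NA
x_complete <- na.omit(x)
nrow(x_complete)

## scale() centres each column to mean 0 and rescales it to unit variance
x_scaled <- scale(x_complete)
colMeans(x_scaled)
apply(x_scaled, 2, sd)

## the centring and scaling values are kept as attributes of the result
attr(x_scaled, "scaled:center")
attr(x_scaled, "scaled:scale")
```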

_D_e_t_a_i_l_s:

     'sigest' estimates a range of values for the sigma parameter
     that can be expected to give good results when used with a Support
     Vector Machine ('ksvm'). The estimation is based upon the 0.1 and
     0.9 quantiles of ||x - x'||^2. Basically, any value between those
     two bounds should produce good results.
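
     The quantile idea can be sketched in a few lines of base R. In
     kernlab's 'rbfdot' kernel, k(x, x') = exp(-sigma ||x - x'||^2), so
     sigma acts as an inverse squared width. The helper
     'sigma_range_sketch' below is hypothetical and not part of
     kernlab. It assumes that random pairs of rows are sampled, that
     'frac' sets the number of pairs, and that the bounds are the
     inverses of the 0.9 and 0.1 quantiles of the squared distances.
     The actual 'sigest' code may differ in these details.

```r
## Hypothetical helper, not part of kernlab: a sketch of the quantile idea
sigma_range_sketch <- function(x, frac = 0.25) {
  x <- scale(as.matrix(x))                 # zero mean, unit variance
  m <- nrow(x)
  n <- max(1L, floor(frac * m))            # number of random pairs to draw
  i <- sample.int(m, n, replace = TRUE)
  j <- sample.int(m, n, replace = TRUE)
  ## squared Euclidean distance between the paired rows
  d2 <- rowSums((x[i, , drop = FALSE] - x[j, , drop = FALSE])^2)
  d2 <- d2[d2 > 0]                         # drop zero distances (e.g. i == j)
  ## assumption: sigma is an inverse width, so invert the distance quantiles
  sort(unname(1 / quantile(d2, probs = c(0.9, 0.1))))
}

set.seed(1)
x <- matrix(rnorm(200 * 4), ncol = 4)      # made-up data
r <- sigma_range_sketch(x, frac = 0.5)
r
```

     Because the pairs are drawn at random, repeated calls give
     slightly different bounds unless the seed is fixed.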

_V_a_l_u_e:

     Returns a vector of length 2 defining the range (upper bound and
     lower bound) of the sigma hyperparameter.
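
     One way to use the returned range is to pick a single value
     inside it or to build a small grid of candidates for tuning. The
     sketch below uses made-up bounds in place of real 'sigest'
     output, and takes 'min' and 'max' so that it does not depend on
     the order of the two elements. The log-spaced grid is a design
     choice for illustration, not something 'sigest' prescribes.

```r
## Hypothetical bounds standing in for the output of sigest()
srange <- c(0.002, 0.05)

lo <- min(srange)
hi <- max(srange)

## midpoint on the original scale and on the log scale (geometric mean)
mid_arith <- (lo + hi) / 2
mid_geom  <- sqrt(lo * hi)

## five log-spaced candidate values covering the range
sigma_grid <- exp(seq(log(lo), log(hi), length.out = 5))
sigma_grid
```

     Each element of 'sigma_grid' could then be passed to 'ksvm' via
     'kpar = list(sigma = ...)' and compared by cross-validation.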

_A_u_t_h_o_r(_s):

     Alexandros Karatzoglou 
       alexandros.karatzoglou@ci.tuwien.ac.at

_R_e_f_e_r_e_n_c_e_s:

     B. Caputo, K. Sim, F. Furesjo, A. Smola, 
      _Appearance-based object recognition using SVMs: which kernel
     should I use?_
      Proc. of NIPS workshop on Statistical methods for computational
     experiments in visual processing and computer vision, Whistler,
     2002.

_S_e_e _A_l_s_o:

     'ksvm'

_E_x_a_m_p_l_e_s:

     ## estimate good sigma values for promotergene
     data(promotergene)
     srange <- sigest(Class~.,data = promotergene)
     srange

     ## use the midpoint of the estimated range as sigma
     s <- sum(srange)/2
     s

     ## create test and training set
     ind <- sample(seq_len(nrow(promotergene)), 20)
     genetrain <- promotergene[-ind, ]
     genetest <- promotergene[ind, ]

     ## train a support vector machine
     gene <- ksvm(Class~.,data=genetrain,kernel="rbfdot",kpar=list(sigma = s),C=50,cross=3)
     gene

     ## predict gene type on the test set
     promoter <- predict(gene,genetest[,-1])

     ## Check results
     table(promoter,genetest[,1])

