promotergene             package:kernlab             R Documentation

_E. _c_o_l_i _p_r_o_m_o_t_e_r _g_e_n_e _s_e_q_u_e_n_c_e_s (_D_N_A)

_D_e_s_c_r_i_p_t_i_o_n:

     Promoters contain a region where a protein (RNA polymerase) must
     make contact, and the helical DNA sequence must have a valid
     conformation so that the two pieces of the contact region align
     spatially. The data set contains DNA sequences of promoters and
     non-promoters.

_U_s_a_g_e:

     data(promotergene)

_F_o_r_m_a_t:

     A data frame with 106 observations and 58 variables. The first
     variable 'Class' is a factor with levels '+' for a promoter gene
     and '-' for a non-promoter gene. The remaining 57 variables 'V2'
     to 'V58' are factors describing the sequence. The DNA bases are
     coded as follows:

     'a' adenine
     'c' cytosine
     'g' guanine
     't' thymine
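
     The layout described above can be checked directly after loading
     the data. The following is a minimal sketch, assuming the kernlab
     package is installed; the name 'base_counts' is introduced here
     for illustration only and is not part of the package.

```r
library(kernlab)
data(promotergene)

# Dimensions as stated in the Format section: 106 rows, 58 columns
dim(promotergene)

# Number of promoter ('+') and non-promoter ('-') sequences
table(promotergene$Class)

# Count each base at every sequence position (all columns except 'Class').
# base_counts is an illustrative name: a 4 x 57 matrix, one column per position.
base_counts <- sapply(
  promotergene[, -1],
  function(pos) table(factor(pos, levels = c("a", "c", "g", "t")))
)

# Show the counts for the first five positions
base_counts[, 1:5]
```

     Selecting the sequence columns by position ('[, -1]') rather than
     by name keeps the sketch independent of the exact column labels.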

_S_o_u_r_c_e:

     UCI Machine Learning data repository
     <URL: ftp://ftp.ics.uci.edu/pub/machine-learning-databases/molecular-biology/promoter-gene-sequences>

_R_e_f_e_r_e_n_c_e_s:

     Towell, G., Shavlik, J. and Noordewier, M.
     _Refinement of Approximate Domain Theories by Knowledge-Based
     Artificial Neural Networks._
     In Proceedings of the Eighth National Conference on Artificial
     Intelligence (AAAI-90).

_E_x_a_m_p_l_e_s:

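     library(kernlab)
     data(promotergene)

     ## Create a classification model using Gaussian processes,
     ## with an RBF kernel and 4-fold cross-validation

     prom <- gausspr(Class ~ ., data = promotergene, kernel = "rbfdot",
                     kpar = list(sigma = 0.02), cross = 4)
     prom

     ## Create a model using Support Vector Machines, with a Laplace
     ## kernel whose parameter is estimated automatically

     promsv <- ksvm(Class ~ ., data = promotergene, kernel = "laplacedot",
                    kpar = "automatic", C = 60, cross = 4)
     promsv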
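
     The models above are fitted on all 106 sequences. As a hedged
     sketch of how such a model might be evaluated on unseen data, the
     following holds out 20 rows, fits the SVM on the rest and predicts
     the held-out classes. The seed, the holdout size and the names
     'test_idx', 'train', 'test', 'fit' and 'pred' are assumptions made
     for illustration, not part of the original example.

```r
library(kernlab)
data(promotergene)

set.seed(1)                                   # arbitrary seed, for repeatability
test_idx <- sample(nrow(promotergene), 20)    # illustrative 20-row holdout
train <- promotergene[-test_idx, ]
test  <- promotergene[test_idx, ]

# Same SVM settings as in the example above, fitted on the training rows only
fit <- ksvm(Class ~ ., data = train, kernel = "laplacedot",
            kpar = "automatic", C = 60, cross = 4)

# 4-fold cross-validation error estimated on the training rows
cross(fit)

# Predict the held-out sequences and compare with the true classes
pred <- predict(fit, test[, -1])
table(predicted = pred, actual = test$Class)
mean(pred == test$Class)                      # holdout accuracy
```

     With only 106 observations, the holdout accuracy varies noticeably
     with the random split, so the cross-validation error is usually
     the steadier of the two figures.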

