s2n                  package:seqinr                  R Documentation

_s_i_m_p_l_e _n_u_m_e_r_i_c_a_l _e_n_c_o_d_i_n_g _o_f _a _D_N_A _s_e_q_u_e_n_c_e.

_D_e_s_c_r_i_p_t_i_o_n:

     By default, if no 'levels' argument is provided, this function
     will just code your DNA sequence in integer values following the
     lexical order '(a < c < g < t)', that is 0 for "a", 1 for "c", 2
     for "g", 3 for "t" and NA for ambiguous bases.
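
     The default encoding described above can be sketched in base R.
     This is a minimal sketch, assuming the behaviour can be reproduced
     with 'factor' (listed under See Also); 'encode_sketch' is a
     hypothetical name used for illustration and is not the seqinr
     implementation.

```r
# Hypothetical helper, not part of seqinr: mimics the default encoding
# described above using only base R.
encode_sketch <- function(x, levels = c("a", "c", "g", "t"), base4 = TRUE) {
  # factor() maps each allowed character to 1..length(levels);
  # characters that are not in 'levels' become NA.
  codes <- as.integer(factor(x, levels = levels))
  # Shift to a 0-based code when base4 is TRUE.
  if (base4) codes - 1L else codes
}

encode_sketch(c("a", "c", "g", "t", "n"))                 # 0 1 2 3 NA
encode_sketch(c("a", "c", "g", "t", "n"), base4 = FALSE)  # 1 2 3 4 NA
```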

_U_s_a_g_e:

     s2n(seq, levels, base4 = TRUE)

_A_r_g_u_m_e_n_t_s:

     seq: a vector of chars 

  levels: allowed char values, by default a, c, g and t 

   base4: if TRUE the numerical encoding will start at 0, if FALSE at 1

     ...: further arguments to factor 

_V_a_l_u_e:

     a vector of integers

_N_o_t_e:

     The idea of starting numbering at 0 by default is that it enforces
     a kind of isomorphism between the paste operator on DNA chars and
     the + operator on the integer coding of DNA chars. This way, you
     can work either in the char set or in the integer set, depending
     on which is more convenient for your purpose, and then switch
     from one set to the other as you like.
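
     One possible reading of this note, offered as an assumption
     rather than as the author's stated intent: with a 0-based code,
     pasting chars into a word corresponds to adding base-4 digits
     weighted by position, so the 16 dinucleotides in lexical order
     map to the integers 0 to 15. The helper 'word_index_sketch' below
     is hypothetical and uses only base R.

```r
# Hypothetical helper, not part of seqinr: base-4 value of a word given
# as a vector of single characters.
word_index_sketch <- function(word, levels = c("a", "c", "g", "t")) {
  digits <- match(word, levels) - 1L   # 0-based code of each character
  k <- length(digits)
  # The leftmost character carries the highest power of 4.
  sum(digits * 4^((k - 1):0))
}

word_index_sketch(c("a", "a"))  # 0
word_index_sketch(c("g", "t"))  # 2 * 4 + 3 = 11
word_index_sketch(c("t", "t"))  # 15
```

     With a 1-based code the same sum would no longer start at 0 for
     the word "aa", which is one way to see why 0-based numbering is
     convenient here.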

_A_u_t_h_o_r(_s):

     J.R. Lobry

_R_e_f_e_r_e_n_c_e_s:

     'citation("seqinr")'

_S_e_e _A_l_s_o:

     'n2s', 'factor', 'unclass'

_E_x_a_m_p_l_e_s:

     #Example of default behaviour
     urndna <- c("a","c","g","t")
     seq <- sample( urndna, 100, replace = TRUE ) ; seq
     s2n(seq)
     #How to deal with RNA: pass the RNA alphabet as 'levels'
     urnrna <- c("a","c","g","u")
     seq <- sample( urnrna, 100, replace = TRUE ) ; seq
     s2n(seq, levels = urnrna)
     #What happens with unknown characters
     urnmess <- c(urndna,"n")
     seq <- sample( urnmess, 100, replace = TRUE ) ; seq
     s2n(seq)
     #How to change the encoding for unknown characters
     tmp <- s2n(seq) ; tmp[is.na(tmp)] <- -1; tmp

