EXP                  package:seqinr                  R Documentation

_V_e_c_t_o_r_s _o_f _c_o_e_f_f_i_c_i_e_n_t_s _t_o _c_o_m_p_u_t_e _l_i_n_e_a_r _f_o_r_m_s.

_D_e_s_c_r_i_p_t_i_o_n:

     This dataset is used to compute linear forms on codon frequencies:
     if 'freq' is a vector of codon frequencies then 'drop(freq %*%
     EXP$CG3)' will return, for instance, the G+C content in third
     codon positions. Base order is the lexical order: a, c, g, t (or
     u).
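
     As a minimal sketch of this idea, the base R code below rebuilds a
     third-position G+C coefficient vector by hand and applies it to a
     made-up frequency vector. The names 'bases', 'grid', 'codons',
     'cg3' and 'freq' are illustrative and not part of seqinr. The
     codon ordering (first position varying slowest) is an assumption
     drawn from the lexical order stated above; it agrees with the
     first values listed for 'CG3' in the Format section.

```r
# Enumerate the 64 codons in lexical order: aaa, aac, aag, aat, aca, ...
# expand.grid() varies its first argument fastest, so the third codon
# position is listed first.
bases  <- c("a", "c", "g", "t")
grid   <- expand.grid(p3 = bases, p2 = bases, p1 = bases,
                      stringsAsFactors = FALSE)
codons <- paste0(grid$p1, grid$p2, grid$p3)

# Coefficient is 1 when the third base is c or g, 0 otherwise
cg3 <- as.numeric(grid$p3 %in% c("c", "g"))

# Hypothetical codon frequencies (uniform here), summing to 1
freq <- rep(1 / 64, 64)

# The linear form: an inner product, dropped to a plain number
gc3 <- drop(freq %*% cg3)
```

     With real data, 'freq' would hold observed relative codon
     frequencies in the same codon order, and 'EXP$CG3' would take the
     place of the hand-built 'cg3'.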

_U_s_a_g_e:

     data(EXP)

_F_o_r_m_a_t:

     List of 24 vectors of coefficients

     _A    num [1:4] 1 0 0 0 

     _A_3   num [1:64] 1 0 0 0 1 0 0 0 1 0 ...

     _A_G_Z  num [1:64] 0 0 0 0 0 0 0 0 1 0 ...

     _A_R_G  num [1:64] 0 0 0 0 0 0 0 0 1 0 ...

     _A_U_3  num [1:64] 1 0 0 1 1 0 0 1 1 0 ...

     _B_C   num [1:64] 0 1 0 0 0 0 0 0 0 0 ...

     _C    num [1:4] 0 1 0 0

     _C_3   num [1:64] 0 1 0 0 0 1 0 0 0 1 ...

     _C_A_I  num [1:64]  0.00  0.00 -1.37 -2.98 -2.58 ...

     _C_G   num [1:4] 0 1 1 0

     _C_G_1  num [1:64] 0 0 0 0 0 0 0 0 0 0 ...

     _C_G_1_2 num [1:64] 0 0 0 0 0.5 0.5 0.5 0.5 0.5 0.5 ...

     _C_G_2  num [1:64] 0 0 0 0 1 1 1 1 1 1 ...

     _C_G_3  num [1:64] 0 1 1 0 0 1 1 0 0 1 ...

     _C_G_N  num [1:64] 0 0 0 0 0 0 0 0 0 0 ...

     _F_1   num [1:64]  1.026  0.239  1.026  0.239 -0.097 ...

     _G    num [1:4] 0 0 1 0

     _G_3   num [1:64] 0 0 1 0 0 0 1 0 0 0 ...

     _K_D   num [1:64] -3.9 -3.5 -3.9 -3.5 -0.7 -0.7 -0.7 -0.7 -4.5 -0.8
          ...

     _Q    num [1:64] 0 0 0 0 1 1 1 1 0 0 ...

     _Q_A_3  num [1:64] 0 0 0 0 1 0 0 0 0 0 ...

     _Q_C_3  num [1:64] 0 0 0 0 0 1 0 0 0 0 ...

     _U    num [1:4] 0 0 0 1

     _U_3   num [1:64] 0 0 0 1 0 0 0 1 0 0 ...

_D_e_t_a_i_l_s:

     It's better to work directly at the amino-acid level when
     computing linear forms on amino-acid frequencies so as to have a
     single coefficient vector. For instance, 'EXP$KD', which computes
     the Kyte and Doolittle hydropathy index from codon frequencies,
     is valid only for the standard genetic code.
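
     A possible way to work at the amino-acid level is sketched below.
     The vector 'kd' holds the Kyte and Doolittle (1982) hydropathy
     values per amino acid; 'prot', 'aafreq' and 'kd_mean' are
     hypothetical names, and the short peptide is invented for
     illustration. Because the coefficients are attached to amino
     acids rather than codons, the same vector applies whatever the
     genetic code.

```r
# Kyte and Doolittle (1982) hydropathy values, one per amino acid
kd <- c(A =  1.8, R = -4.5, N = -3.5, D = -3.5, C =  2.5,
        Q = -3.5, E = -3.5, G = -0.4, H = -3.2, I =  4.5,
        L =  3.8, K = -3.9, M =  1.9, F =  2.8, P = -1.6,
        S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V =  4.2)

# Hypothetical peptide, given as one-letter amino-acid codes
prot <- c("M", "K", "T", "A", "Y", "I", "A", "K")

# Relative amino-acid frequencies, aligned with the names of 'kd'
aafreq <- table(factor(prot, levels = names(kd))) / length(prot)

# Average hydropathy as a linear form on amino-acid frequencies
kd_mean <- drop(as.numeric(aafreq) %*% kd)
```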

     An alternative to 'drop(freq %*% EXP$CG3)' is 'sum(freq *
     EXP$CG3)', but this is less efficient in terms of CPU time. The
     advantage of the latter, however, is that thanks to recycling
     rules you can use either 'sum(freq * EXP$A)' or 'sum(freq *
     EXP$A3)'. To do the same with the %*% operator you have to make
     the recycling explicit, as in 'drop(freq %*% rep(EXP$A, 16))'.
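
     The two forms can be compared with a small sketch. The vector 'a'
     below copies the values shown for 'A' in the Format section, and
     'freq' is a made-up uniform frequency vector; both names are
     illustrative.

```r
# Illustrative inputs: uniform codon frequencies and a 4-element vector
freq <- rep(1 / 64, 64)
a    <- c(1, 0, 0, 0)   # same values as listed for A in the Format section

# Element-wise product: 'a' is recycled 16 times to reach length 64
a3_sum <- sum(freq * a)

# Matrix product: the recycling has to be written out with rep()
a3_mat <- drop(freq %*% rep(a, 16))

# Both forms are expected to agree
same <- isTRUE(all.equal(a3_sum, a3_mat))
```

     Since the third codon position varies fastest in the codon order,
     recycling the 4-element vector gives the third-position
     coefficients, so both calls measure A content in third position.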

_S_o_u_r_c_e:

     ANALSEQ EXPFILEs for command EXP.
      <URL: http://biomserv.univ-lyon1.fr/doclogi/docanals/manuel.html>

_R_e_f_e_r_e_n_c_e_s:

     'citation("seqinr")'

     _A    content in A nucleotide 

     _A_3   content in A nucleotide in third position of codon      

     _A_G_Z  Arg content (aga and agg codons) 

     _A_R_G  Arg content 

     _A_U_3  content in A and U nucleotides in third position of codon  

     _B_C   Good choice (Bon choix). Gouy M., Gautier C. (1982) codon
          usage in bacteria : Correlation with gene expressivity.
          _Nucleic Acids Research_,*10(22)*:7055-7074. 

     _C    content in C nucleotides 

     _C_3   content in A nucleotides in third position of codon 

     _C_A_I  Codon adaptation index for E. coli. Sharp, P.M., Li, W.-H.
          (1987) The codon adaptation index - a measure of directionam
          synonymous codon usage bias, and its potential applications.
          _Nucleic Acids Research_,*15*:1281-1295.

     _C_G   content in G + C nucleotides 

     _C_G_1  content in G + C nucleotides in first position of codon  

     _C_G_1_2 content in G + C nucleotides in first and second position of
          codon

     _C_G_2  content in G + C nucleotides in second position of codon 

     _C_G_3  content in G + C nucleotides in third position of codon 

     _C_G_N  content in CGA + CGU + CGA + CGG 

     _F_1   From Table 2 in Lobry, J.R., Gautier, C. (1994)
          Hydrophobicity, expressivity and aromaticity are the major
          trends of amino-acid usage in 999 _Escherichia coli_
          chromosome-encode genes. _Nucleic Acids
          Research_,*22*:3174-3180.

     _G_3   content in G nucleotides in third position of codon 

     _K_D   Kyte, J., Doolittle, R.F. (1982) A simple method for
          displaying the hydropathic character of a protein. _J. Mol.
          Biol._,*157* :105-132.

     _Q    content in quartet 

     _Q_A_3  content in quartet with the A nucleotide in third position 

     _Q_C_3  content in quartet with the A nucleotide in third position 

     _U    content in U nucleotide 

     _U_3   content in U nucleotides in third position of codon 

_E_x_a_m_p_l_e_s:

     data(EXP)

