uco                  package:seqinr                  R Documentation

_C_o_d_o_n _u_s_a_g_e _i_n_d_i_c_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     'uco' calculates some codon usage indices: the codon counts 'eff',
     the relative frequencies 'freq' or the Relative Synonymous Codon
     Usage 'rscu'.

_U_s_a_g_e:

     uco(seq, frame = 0, index = c("eff", "freq", "rscu"), as.data.frame = FALSE) 

_A_r_g_u_m_e_n_t_s:

     seq: a coding sequence as a vector of single characters

   frame: an integer (0, 1, 2) giving the frame of the coding sequence 

   index: the codon usage index to compute; partial matching is
          allowed.  'eff' for codon counts, 'freq' for codon relative
          frequencies, and 'rscu' for the RSCU index.

as.data.frame: logical. If 'TRUE', all indices are returned in a data
          frame.

_D_e_t_a_i_l_s:

     Codons with ambiguous bases are ignored.

     RSCU is a simple measure of non-uniform usage of synonymous codons
     in a coding sequence (Sharp _et al._ 1986). RSCU values are the
     number of times a particular codon is observed, relative to the
     number  of times that the codon would be observed for a uniform
     synonymous codon usage (i.e. all the codons for a given amino-acid
     have the same probability).  In the absence of any codon usage
     bias, all RSCU values are 1.00 (this is the case for the sequence
     'cds' in the Examples below). A codon that is used less
     frequently than expected has an RSCU value below 1.00, and a
     codon that is used more frequently than expected has a value
     above 1.00.
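
     As a minimal sketch of this definition (not the package's own
     code), RSCU can be computed by hand in base R for a single
     synonymous family. The counts below are made up for illustration,
     and 'counts', 'expected' and 'rscu' are hypothetical names; the
     four codons are the alanine codons of the standard genetic code:

       ## Made-up codon counts for one amino acid (illustration only):
       counts <- c(gct = 10, gcc = 30, gca = 5, gcg = 15)
       ## Expected count per codon under uniform synonymous usage:
       expected <- sum(counts) / length(counts)
       ## RSCU = observed count / expected count:
       rscu <- counts / expected
       round(rscu, 2)
       ## Within a family, RSCU values average to 1:
       mean(rscu)

     Here 'gcc' is observed 30 times against an expected 15, so its
     RSCU is 2, while 'gca' (5 against 15) falls below 1.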
     

     Do not use correspondence analysis on RSCU tables, as this is a
     source of artifacts (Perriere and Thioulouse 2002).
     Within-amino-acid correspondence analysis is a simple way to study
     synonymous codon usage (Charif _et al._ 2005).

_V_a_l_u_e:

     If 'as.data.frame' is 'TRUE', 'uco' returns a data frame with five
     columns:

     aa : a vector containing the name of amino-acid 

  codon : a vector containing the corresponding codon 

    eff : a numeric vector of codon counts 

   freq : a numeric vector of codon relative frequencies 

   rscu : a numeric vector of RSCU values 

     If 'as.data.frame' is 'FALSE' (the default), 'uco' returns only
     the index selected by 'index':

    eff : a table of codon counts 

   freq : a table of codon relative frequencies 

   rscu : a vector of relative synonymous codon usage values

_A_u_t_h_o_r(_s):

     D. Charif, J.R. Lobry

_R_e_f_e_r_e_n_c_e_s:

     'citation("seqinr")' 

     Sharp, P.M., Tuohy, T.M.F., Mosurski, K.R. (1986) Codon usage in
     yeast: cluster analysis clearly differentiates highly and lowly
     expressed genes. _Nucl. Acids. Res._, *14*:5125-5143.

     Perriere, G., Thioulouse, J. (2002) Use and misuse of
     correspondence analysis in codon usage studies. _Nucl. Acids.
     Res._, *30*:4548-4555.

     Charif, D., Thioulouse, J., Lobry, J.R., Perriere, G. (2005)
     Online  Synonymous Codon Usage Analyses with the ade4 and seqinR
     packages.  _Bioinformatics_, *21*:545-547. <URL:
     http://pbil.univ-lyon1.fr/members/lobry/repro/bioinfo04/>.

_E_x_a_m_p_l_e_s:

     ## Show all possible codons:
     words()
     ## Make a coding sequence from this:
     (cds <- s2c(paste(words(), collapse = "")))
     ## Get codon counts:
     uco(cds, index = "eff")
     ## Get codon relative frequencies:
     uco(cds, index = "freq")
     ## Get RSCU values:
     uco(cds, index = "rscu")
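     ## Sketch of a sanity check (an addition to the original examples,
     ## assuming seqinr is attached as above): every codon occurs
     ## exactly once in 'cds', so all RSCU values should equal 1, up to
     ## floating-point error:
     all.equal(as.numeric(uco(cds, index = "rscu")), rep(1, 64))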
     ## Show what happens with ambiguous bases:
     uco(s2c("aaannnttt"))
     ## Use a real coding sequence:
     ## (the file is passed by position, as the name of this argument
     ## may differ between seqinr versions)
     rcds <- read.fasta(system.file("sequences/malM.fasta", package = "seqinr"))[[1]]
     uco( rcds, index = "freq")
     uco( rcds, index = "eff")
     uco( rcds, index = "rscu")
     uco( rcds, as.data.frame = TRUE)
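     ## A hedged sketch, not part of the original examples, of the
     ## 'frame' argument (assumes seqinr is attached as above).
     ## 'shifted' and 'res' are made-up names. With frame = 1 the
     ## first base is skipped, so the codons read from "catgaaa"
     ## should be "atg" and "aaa":
     shifted <- s2c("catgaaa")
     res <- uco(shifted, frame = 1, index = "eff")
     ## Keep only the codons that were actually observed:
     res[res > 0]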

