chargaff               package:seqinr               R Documentation

_B_a_s_e _c_o_m_p_o_s_i_t_i_o_n _i_n _s_s_D_N_A _f_o_r _7 _b_a_c_t_e_r_i_a_l _D_N_A

_D_e_s_c_r_i_p_t_i_o_n:

     Long before the genomic era, it was possible to get some data for
     the global composition of single-stranded DNA chromosomes by
     direct chemical analyses. These data are from Chargaff's lab and
     give the base composition of the L (Light) strand for 7 bacterial
     chromosomes.

_U_s_a_g_e:

     data(chargaff)

_F_o_r_m_a_t:

     A data frame with 7 observations on the following 4 variables.

     [_A] frequencies of A bases in percent

     [_G] frequencies of G bases in percent

     [_C] frequencies of C bases in percent

     [_T] frequencies of T bases in percent

_D_e_t_a_i_l_s:

     Data are from Table 2 in Rudner _et al._ (1969) for the L-strand.
     Data for _Bacillus subtilis_ were taken from a previous paper:
     Rudner _et al._ (1968). This is in fact the average value observed
     for two different strains of _B. subtilis_: strain W23 and strain
     Mu8u5u16.
      Denatured chromosomes can be separated, by a technique of
     intermittent gradient elution from a column of methylated albumin
     kieselguhr (MAK), into two fractions designated, by virtue of
     their buoyant densities, as L (light) and H (heavy). The fractions
     can be hydrolyzed and subjected to chromatography to determine
     their global base composition.
      The surprising result is that we have almost exactly A=T and C=G
     in single-stranded DNAs. The second paragraph on page 157 of Rudner
     _et al._ (1969) says: "Our previous work on the complementary
     strands of _B. subtilis_ DNA suggested an additional, entirely
     unexpected regularity, namely, the equality in either strand of
     6-amino and 6-keto nucleotides (A + C = G + T). This
     relationship, which would normally have been regarded merely as
     the consequence of base-pairing in DNA duplex and would not have
     been predicted as a likely property of a single strand, is shown
     here to apply to all strand specimens isolated from denaturated
     DNA of the AT type (Table 2, preps. 1-4). It cannot yet be said to
     be established for the DNA specimens from the equimolar and GC
     types (nos. 5-7)."
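
     The regularities discussed above can be expressed as three
     differences that should all be close to zero within a strand. A
     minimal sketch, assuming the column names '[A]', '[G]', '[C]' and
     '[T]' used in the Examples; 'strand_deviations' is a hypothetical
     helper, not a seqinr function, and the 'toy' row is invented for
     illustration, not a value from Rudner _et al._

```r
# Hypothetical helper (not part of seqinr): intra-strand deviations from
# [A]=[T], from [C]=[G], and from A + C = G + T, in percentage points.
strand_deviations <- function(df) {
  a  <- df[["[A]"]]
  g  <- df[["[G]"]]
  cc <- df[["[C]"]]
  tt <- df[["[T]"]]
  data.frame(
    A_minus_T        = a - tt,
    C_minus_G        = cc - g,
    amino_minus_keto = (a + cc) - (g + tt),  # 6-amino minus 6-keto
    row.names        = rownames(df)
  )
}

# Made-up illustrative row, NOT taken from the dataset.
toy <- data.frame(31, 19, 21, 29)
names(toy) <- c("[A]", "[G]", "[C]", "[T]")
toy_dev <- strand_deviations(toy)

# With the real data, only if seqinr happens to be installed.
if (requireNamespace("seqinr", quietly = TRUE)) {
  data(chargaff, package = "seqinr")
  print(strand_deviations(chargaff))
}
```

     Note that the third column is simply the sum of the first two,
     since (A + C) - (G + T) = (A - T) + (C - G).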

_S_o_u_r_c_e:

     Rudner, R., Karkas, J.D., Chargaff, E. (1968) Separation of _B.
     subtilis_ DNA into complementary strands, III. Direct Analysis.
     _Proceedings of the National Academy of Sciences of the United
     States of America_, *60*:921-922.
      Rudner, R., Karkas, J.D., Chargaff, E. (1969) Separation of
     microbial deoxyribonucleic acids into complementary strands.
     _Proceedings of the National Academy of Sciences of the United
     States of America_, *63*:152-159.

_R_e_f_e_r_e_n_c_e_s:

     Try 'example(chargaff)' to mimic the figure on page 17 of <URL:
     http://pbil.univ-lyon1.fr/members/lobry/articles/HDR.pdf>. The red
     areas correspond to disallowed values, because the sum of the four
     base frequencies cannot exceed 100%. The white areas correspond
     to possible values (more exactly, to the projection from 'R^4' to
     the corresponding 'R^2' planes of the region of allowed values).
     The blue lines correspond to the very small subset of allowed
     values that in addition satisfy the PR2 state, that is '[A]=[T]'
     and '[C]=[G]'. Remember, these data are for ssDNA!
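
     The geometry of each panel can be written down directly. A pair of
     frequencies (x, y) is possible only if x + y <= 100. Under PR2,
     with the four frequencies summing to 100, a complementary pair
     satisfies y = x (with x at most 50), while any other pair
     satisfies x + y = 50, since 2[A] + 2[C] = 100. A minimal sketch
     under those assumptions; 'is_allowed' and 'pr2_segment' are
     hypothetical helpers, not seqinr functions, and 'pr2_segment'
     assumes two different base names, as in the off-diagonal panels.

```r
# Hypothetical helpers (not part of seqinr).

# A pair of base frequencies (x, y), in percent, is possible only if both
# are non-negative and leave room for the other two bases: x + y <= 100.
is_allowed <- function(x, y) x >= 0 & y >= 0 & x + y <= 100

# Endpoints of the PR2 segment for a panel with different base names:
# complementary pairs (A/T, C/G) lie on y = x, other pairs on y = 50 - x.
pr2_segment <- function(iname, jname) {
  complementary <- list(c("[A]", "[T]"), c("[C]", "[G]"))
  is_comp <- any(vapply(complementary,
                        function(p) setequal(p, c(iname, jname)),
                        logical(1)))
  if (is_comp) {
    c(x0 = 0, y0 = 0, x1 = 50, y1 = 50)
  } else {
    c(x0 = 0, y0 = 50, x1 = 50, y1 = 0)
  }
}

# Example calls.
ok_point  <- is_allowed(30, 20)       # inside the white triangle
bad_point <- is_allowed(60, 50)       # inside the shaded triangle
seg_at    <- pr2_segment("[A]", "[T]")
seg_ac    <- pr2_segment("[A]", "[C]")
```

     The returned endpoints match the arguments of the 'segments()'
     calls in the Examples, so the helper could be used as
     'do.call(segments, as.list(seg_at))' on an open plot.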

     'citation("seqinr")'

_E_x_a_m_p_l_e_s:

     data(chargaff)
     op <- par(no.readonly = TRUE)
     par(mfrow=c(4,4))
     xlim <- c(0,100)
     ylim <- xlim
     par(mai=rep(0,4))
     par(c(0.01, 0.99, 0.01, 0.99))
     par(xaxs="i")
     par(yaxs="i")

     for( i in 1:4 )
     {
       for( j in 1:4 )
       {
         if( i == j )
         {
           plot(chargaff[,i], chargaff[,j], type="n", xlim=xlim, ylim=ylim,
           xlab="", ylab="", xaxt="n", yaxt="n")
           polygon(x=c(0,0,100,100),y=c(0,100,100,0), col="lightgrey")
           for( k in seq(0,100,by=10) )
           {
             lseg <- 3
             segments(k,0,k,lseg)
             segments(k,100-lseg,k,100)
             segments(0,k,lseg,k)
             segments(100-lseg,k,100,k)
           }
           string <- paste(names(chargaff)[i],"\n\n",xlim[1],"% -",xlim[2],"%")
           text(x=mean(xlim),y=mean(ylim), string, cex = 1.5)
         }
         else
         {
           plot(chargaff[,i], chargaff[,j], pch=20, xlim=xlim, ylim=ylim,
           xlab="",ylab="", xaxt="n", yaxt="n")
           iname <- names(chargaff)[i]
           jname <- names(chargaff)[j]
           direct <- function() segments(0,0,50,50, col="blue")
           invers <- function() segments(0,50,50,0, col="blue")
           PR2 <- function()
           {
             if( iname == "[A]" && jname == "[T]" ) { direct(); return() }
             if( iname == "[T]" && jname == "[A]" ) { direct(); return() }
             if( iname == "[C]" && jname == "[G]" ) { direct(); return() }
             if( iname == "[G]" && jname == "[C]" ) { direct(); return() }
             invers()
           }
           PR2()
           polygon(x=c(0,100,100), y=c(100,100,0), col="lightpink")
           polygon(x=c(0,0,100), y=c(0,100,0))
         }
       }
     }
     # Clean up
     par(op)

