genotype              package:genetics              R Documentation

_G_e_n_o_t_y_p_e _o_r _H_a_p_l_o_t_y_p_e _O_b_j_e_c_t_s.

_D_e_s_c_r_i_p_t_i_o_n:

     'genotype' creates a genotype object.

     'haplotype' creates a haplotype object.

     'is.genotype' returns 'TRUE' if 'x' is of class 'genotype'.

     'is.haplotype' returns 'TRUE' if 'x' is of class 'haplotype'.

     'as.genotype' attempts to coerce its argument into an object of
     class 'genotype'.

     'as.genotype.allele.count' converts allele counts (0,1,2) into
     genotype pairs ("A/A", "A/B", "B/B").

     'as.haplotype' attempts to coerce its argument into an object of
     class 'haplotype'.

     'nallele' returns the number of alleles in an object of class
     'genotype'.

_U_s_a_g_e:

       genotype(a1, a2=NULL, alleles=NULL, sep="/", remove.spaces=TRUE,
                reorder = c("yes", "no", "default", "ascii", "freq"),
                allow.partial.missing=FALSE, locus=NULL)

       haplotype(a1, a2=NULL, alleles=NULL, sep="/", remove.spaces=TRUE,
                reorder="no", allow.partial.missing=FALSE, locus=NULL)

       is.genotype(x)

       is.haplotype(x)

       as.genotype(x, ...)

       as.genotype.allele.count(x, alleles=c("A","B"), ... )

       as.haplotype(x, ...)

       print.genotype(x, ...)

       nallele(x)

_A_r_g_u_m_e_n_t_s:

       x: either an object of class 'genotype' or 'haplotype' or an
          object to be converted to class 'genotype' or 'haplotype'.

   a1,a2: vector(s) or matrix containing two alleles for each
          individual. See details, below.

 alleles: names (and order if 'reorder="yes"') of possible alleles.

     sep: character separator or column number used to divide alleles
          when 'a1' is a vector of strings where each string holds both
          alleles. See below for details.

remove.spaces: logical indicating whether spaces and tabs will be
          removed from 'a1' and 'a2' before processing.

 reorder: how alleles within an individual should be reordered. If
          'reorder="no"', use the order specified by the 'alleles'
          parameter.  If 'reorder="freq"' or 'reorder="yes"', sort
          alleles within each individual by observed frequency.  If
          'reorder="ascii"', reorder alleles in ASCII order
          (alphabetical, with all upper case before lower case). The
          default value for 'genotype' is '"freq"'.  The default value
          for 'haplotype' is '"no"'.

allow.partial.missing: logical indicating whether one allele is
          permitted to be missing.  When set to 'FALSE', both alleles
          are set to 'NA' if either is missing.

   locus: object of class locus, gene, or marker, holding information
          about the source of this genotype.

     ...: optional arguments

_D_e_t_a_i_l_s:

     Genotype objects hold information on which gene or marker alleles
     were observed for different individuals.  For each individual, two
     alleles are recorded.

     The genotype class considers the stored alleles to be unordered,
     i.e., "C/T" is equivalent to "T/C".  The haplotype class considers
     the order of the alleles to be significant so that "C/T" is
     distinct from "T/C".
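
     The difference between the two comparisons can be sketched in
     base R.  This is only an illustration of the rule stated above,
     not the package's implementation; the helper name 'canon' is
     hypothetical.

```r
# Hypothetical helper (not part of the genetics package): put the two
# alleles of each "a/b" string into sorted order so that unordered
# pairs compare equal.
canon <- function(x) {
  sapply(strsplit(x, "/", fixed = TRUE),
         function(p) paste(sort(p), collapse = "/"))
}

data1 <- c("C/C", "C/T", "T/C")
data2 <- c("C/C", "T/C", "T/C")

# genotype-like comparison: allele order is ignored
unordered_equal <- canon(data1) == canon(data2)

# haplotype-like comparison: allele order matters
ordered_equal <- data1 == data2
```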

     When calling 'genotype' or 'haplotype':

        *  If only 'a1' is provided and is a character vector, it is
           assumed that each element encodes both alleles. In this
           case, if 'sep' is a character string, 'a1' is assumed to be
           coded as "Allele1<sep>Allele2".  If 'sep' is a numeric
           value, it is assumed that character locations '1:sep'
           contain allele 1 and that remaining locations contain allele
           2.

        *  If 'a1' is a matrix, it is assumed that column 1 contains
           allele 1 and column 2 contains allele 2.

        *  If 'a1' and 'a2' are both provided, each is assumed to
           contain one allele value so that the genotype for an
           individual is obtained by 'paste(a1,a2,sep="/")'.
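
     The three input forms above can be sketched in base R as
     follows.  The helper 'split_alleles' is hypothetical and only
     mirrors the splitting rule as described here; the package's own
     code may differ.

```r
# Hypothetical helper (not part of the genetics package): split each
# element of a1 into two alleles following the rule described above.
split_alleles <- function(a1, sep = "/") {
  if (is.numeric(sep)) {
    # numeric sep: characters 1:sep hold allele 1, the rest allele 2
    cbind(substring(a1, 1, sep), substring(a1, sep + 1))
  } else {
    # character sep: each element is coded "Allele1<sep>Allele2"
    parts <- strsplit(a1, sep, fixed = TRUE)
    cbind(sapply(parts, `[`, 1), sapply(parts, `[`, 2))
  }
}

by_string   <- split_alleles(c("C-C", "C-T", "T-T"), sep = "-")
by_position <- split_alleles(c("DD", "DI", "II"), sep = 1)

# a two-column matrix (or separate a1 and a2 vectors) is combined
# per individual with paste(a1, a2, sep = "/")
pasted <- paste(by_position[, 1], by_position[, 2], sep = "/")
```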

     If 'remove.spaces' is 'TRUE' (the default), any whitespace
     contained in 'a1' and 'a2' is removed when the genotypes are
     created.  If whitespace is used as the separator (e.g., "C C",
     "C T", ...), be sure to set 'remove.spaces' to 'FALSE'.
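
     A base-R sketch of why this matters, assuming the removal behaves
     like deleting every space and tab (the exact mechanism used by
     the package is not shown here):

```r
raw <- c("D   /   D", "D\tD/   I", "C T")

# delete all spaces and tabs, as remove.spaces is described to do
stripped <- gsub("[ \t]", "", raw)

# "C T" collapses to "CT": a whitespace separator no longer exists
# after stripping, so the two alleles cannot be split on " "
stripped
```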

     When the alleles are explicitly specified using the 'alleles'
     argument, all potential alleles not present in the list will be
     converted to 'NA'.
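
     The effect on individual allele values can be sketched in base R.
     This only illustrates the rule above; the variable names are
     hypothetical.

```r
observed <- c("C", "T", "G", "C")   # allele values found in the data
alleles  <- c("C", "T")             # explicitly permitted alleles

# values outside the permitted set become NA
restricted <- ifelse(observed %in% alleles, observed, NA)
restricted
```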

     NOTE: 'genotype' assumes that the order of the alleles is not
     important (e.g., "A/C" == "C/A").  Use class 'haplotype' if order
     is significant.

_V_a_l_u_e:

     The genotype class extends "factor" and haplotype extends
     genotype. Both classes have the following attributes: 

  levels: character vector of possible genotype/haplotype values,
          coded as 'paste( allele1, "/", allele2, sep="")'.

allele.names: character vector of possible alleles. For a SNP, these
          might be c("A","T").  For a variable-length dinucleotide
          repeat, these might be c("136","138","140","148").

allele.map: matrix encoding how the factor levels correspond to
          alleles.  See the source code to 'allele.genotype()' for how
          to extract allele values using this matrix.  Better yet, just
          use 'allele.genotype()'. 

_A_u_t_h_o_r(_s):

     Gregory R. Warnes <Gregory_R_Warnes@groton.pfizer.com> and
     Friedrich Leisch.

_S_e_e _A_l_s_o:

     'HWE.test', 'allele', 'homozygote', 'heterozygote', 'carrier',
     'summary.genotype', 'allele.count', 'locus', 'gene', 'marker'

_E_x_a_m_p_l_e_s:

     # several examples of genotype data in different formats
     example.data   <- c("D/D","D/I","D/D","I/I","D/D",
                         "D/D","D/D","D/D","I/I","")
     g1  <- genotype(example.data)
     g1

     example.data2  <- c("C-C","C-T","C-C","T-T","C-C",
                         "C-C","C-C","C-C","T-T","")
     g2  <- genotype(example.data2,sep="-")
     g2

     example.nosep  <- c("DD", "DI", "DD", "II", "DD",
                         "DD", "DD", "DD", "II", "")
     g3  <- genotype(example.nosep,sep="")
     g3

     example.a1 <- c("D",  "D",  "D",  "I",  "D",  "D",  "D",  "D",  "I",  "")
     example.a2 <- c("D",  "I",  "D",  "I",  "D",  "D",  "D",  "D",  "I",  "")
     g4  <- genotype(example.a1,example.a2)
     g4

     example.mat <- cbind(a1=example.a1, a2=example.a2)
     g5  <- genotype(example.mat)
     g5

     example.data5  <- c("D   /   D","D   /   I","D   /   D","I   /   I",
                         "D   /   D","D   /   D","D   /   D","D   /   D",
                         "I   /   I","")
     g5  <- genotype(example.data5,remove.spaces=TRUE)
     g5

     # show how genotype and haplotype differ
     data1 <- c("C/C", "C/T", "T/C")
     data2 <- c("C/C", "T/C", "T/C")

     test1  <- genotype( data1 )
     test2  <- genotype( data2 )

     test3  <-  haplotype( data1 )
     test4  <-  haplotype( data2 )

     test1==test2
     test3==test4

     test1=="C/T"
     test1=="T/C"

     test3=="C/T"
     test3=="T/C"

     ## "Messy" example

     m3  <-  c("D D/\t   D D","D\tD/   I",  "D D/   D D","I/   I",
               "D D/   D D","D D/   D D","D D/   D D","D D/   D D",
               "I/   I","/   ","/I")

     genotype(m3)
     summary(genotype(m3))

     m4  <-  c("D D","D I","D D","I I",
               "D D","D D","D D","D D",
               "I I","   ","  I")

     genotype(m4,sep=1)
     genotype(m4,sep=" ",remove.spaces=FALSE)
     summary(genotype(m4,sep=" ",remove.spaces=FALSE))

     m5  <-  c("DD","DI","DD","II",
               "DD","DD","DD","DD",
               "II","   "," I")
     genotype(m5,sep=1)
     haplotype(m5,sep=1,remove.spaces=FALSE)

     g5  <- genotype(m5,sep="")
     h5  <- haplotype(m5,sep="")

     heterozygote(g5) 
     homozygote(g5)    
     carrier(g5,"D")

     g5[9:10]  <- haplotype(m4,sep=" ",remove.spaces=FALSE)[1:2]
     g5

     g5[9:10]
     allele(g5[9:10],1)
     allele(g5,1)[9:10]

     # drop unused alleles 
     g5[9:10,drop=TRUE]
     h5[9:10,drop=TRUE]

     # Convert allele.counts into genotype

     x <- c(0,1,2,1,1,2,NA,1,2,1,2,2,2)
     g <- as.genotype.allele.count(x, alleles=c("C","T") )
     g

