| gal_all {CORREP} | R Documentation |
The data were compiled by Mario Medvedovic et al, 2003, based on the original full data reported in Ideker et al, 2001. The data set has 205 rows (genes) and 20 experiments, each with 4 repeated measurements. The genes fall into 4 classes, which correspond to functional categories. Approximately 8% of the data were missing; all missing values were imputed with k-nearest neighbor (k = 12).
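The imputation step described above can be illustrated with a minimal base-R sketch. It is a simplified k-nearest-neighbor scheme and not necessarily the procedure used to prepare gal_all. The function name knn_impute, the Euclidean distance over observed columns, the restriction of neighbors to complete rows, and the toy matrix are all assumptions made for illustration.

```r
# Simplified k-nearest-neighbor imputation (illustrative only; knn_impute is a
# hypothetical helper, not part of CORREP and not the authors' code).
knn_impute <- function(x, k = 12) {
  stopifnot(is.matrix(x), is.numeric(x))
  complete_rows <- which(stats::complete.cases(x))
  stopifnot(length(complete_rows) >= 1)
  out <- x
  for (i in which(!stats::complete.cases(x))) {
    miss <- is.na(x[i, ])
    if (all(miss)) next  # no observed values to compute a distance on
    # Euclidean distance to every complete row, using observed columns only
    diffs <- sweep(x[complete_rows, !miss, drop = FALSE], 2, x[i, !miss])
    d <- sqrt(rowSums(diffs^2))
    # Indices of the k closest complete rows (fewer if not enough rows exist)
    nn <- complete_rows[order(d)[seq_len(min(k, length(d)))]]
    # Fill each missing entry with the neighbors' column mean
    out[i, miss] <- colMeans(x[nn, miss, drop = FALSE])
  }
  out
}

# Toy matrix: row 1 is missing its third value; rows 2 and 3 are its nearest
# complete neighbors, row 4 is far away.
toy <- rbind(c(1, 1, NA),
             c(1, 1, 10),
             c(1, 1, 20),
             c(9, 9, 100))
filled <- knn_impute(toy, k = 2)
filled[1, 3]  # 15, the mean of 10 and 20
```

Observed values are left untouched; only the NA entries are replaced. The default k = 12 mirrors the value quoted in the description.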
data(gal_all)
A data frame with 205 rows (genes) and the following 80 columns (20 experiments with 4 replicated measurements each).
wtRG1, wtRG2, wtRG3, wtRG4,
gal1RG1, gal1RG2, gal1RG3, gal1RG4,
gal2RG1, gal2RG2, gal2RG3, gal2RG4,
gal3RG1, gal3RG2, gal3RG3, gal3RG4,
gal4RG1, gal4RG2, gal4RG3, gal4RG4,
gal5RG1, gal5RG2, gal5RG3, gal5RG4,
gal6RG1, gal6RG2, gal6RG3, gal6RG4,
gal7RG1, gal7RG2, gal7RG3, gal7RG4,
gal10RG1, gal10RG2, gal10RG3, gal10RG4,
gal80RG1, gal80RG2, gal80RG3, gal80RG4,
wtR1, wtR2, wtR3, wtR4,
gal1R1, gal1R2, gal1R3, gal1R4,
gal2R1, gal2R2, gal2R3, gal2R4,
gal3R1, gal3R2, gal3R3, gal3R4,
gal4R1, gal4R2, gal4R3, gal4R4,
gal5R1, gal5R2, gal5R3, gal5R4,
gal6R1, gal6R2, gal6R3, gal6R4,
gal7R1, gal7R2, gal7R3, gal7R4,
gal10R1, gal10R2, gal10R3, gal10R4,
gal80R1, gal80R2, gal80R3, gal80R4

The 205 genes have been classified into four functional classes based on their GO annotations. In the data example provided in the vignette, we treat the four classes as true memberships (external knowledge) and use them to evaluate the performance of different correlation-measure-based clustering methods.
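The 80 names listed above follow a regular pattern: a strain label, a suffix (RG or R), and a replicate number from 1 to 4. The sketch below rebuilds that list in base R so the layout can be checked or used for indexing. The variable names (strains, suffixes, expected_names) are chosen here for illustration, and the meaning of the RG and R suffixes is not stated on this page, so the code treats them as opaque labels.

```r
# Rebuild the 80 measurement names from their three components.
strains    <- c("wt", paste0("gal", c(1:7, 10, 80)))  # 10 strain labels
suffixes   <- c("RG", "R")                            # 2 suffixes, meaning not given here
replicates <- 1:4                                     # 4 repeated measurements

# expand.grid varies its first argument fastest, which matches the listed
# order: replicate within strain within suffix.
grid <- expand.grid(replicate = replicates,
                    strain    = strains,
                    suffix    = suffixes,
                    stringsAsFactors = FALSE)
expected_names <- paste0(grid$strain, grid$suffix, grid$replicate)

length(expected_names)   # 80 = 10 strains x 2 suffixes x 4 replicates
head(expected_names, 5)  # "wtRG1" "wtRG2" "wtRG3" "wtRG4" "gal1RG1"
```

With the package attached and data(gal_all) loaded, expected_names could be compared against the names stored in the data frame, for example to select the four replicates of a single experiment.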
http://expression.microslu.washington.edu/expression/kayee/medvedovic2003/medvedovic_bioinf2003.html
Medvedovic M, Yeung KY and Bumgarner RE. 2004. Bayesian Mixture Model Based Clustering of Replicated Microarray Data. Bioinformatics, 22;20(8):1222-32.

Ideker, T., Thorsson, V., Siegel, A. and Hood, L. Testing for Differentially-Expressed Genes by Maximum-Likelihood Analysis of DNA Microarray Data. Journal of Computational Biology 7: 805-817 (2000).
data(gal_all) ## maybe str(gal_all) ; plot(gal_all) ...