| variableSelector {BAMD} | R Documentation |
This function carries out variable selection on the following linear mixed model
Y = X β + Z γ + ε
where the covariates for the random effects (in the Z-matrix) have missing values. The Z-matrix consists of Single Nucleotide Polymorphism (SNP) data and the Y-vector contains the phenotypic trait of interest. The X-matrix typically describes the family structure of the organisms.
The best models are ranked by their Bayes factors, and the function uses the imputed
values produced by the gibbsSampler function.
variableSelector(fname, n, p, s, nsim, keep = 5, prop = 0.75,
codaOut="CodaChain.txt", codaIndex="CodaIndex.txt",
missingfile = "Imputed_missing_vals", SNPsubset)
fname |
fname should be the name of a .csv file. This file should
contain the Y, X, Z and R matrices for the model, in that particular order. Hence it
should contain n times (1 + p + s + n) values. There should be a header row in the
input file as well. The Z matrix should use the values 1,2,3 for the SNPs and 0 for any missing SNPs.
The program will convert the SNP codings to -1,0,1 and work with those. |
n |
n refers to the length of the Y-vector; equivalent to the number of
observations in the dataset. |
p |
p is the number of columns of the X-matrix. |
s |
s is the number of columns of the Z-matrix. Note that this is the total number of original SNPs put through the Gibbs sampler. |
nsim |
nsim specifies the number of iterations of the Metropolis-Hastings
chain to carry out. |
keep |
keep specifies the number of models to store. The top
keep models will be retained. |
prop |
The candidate distribution for the Metropolis-Hastings chain is a mixture, one
of whose components is a random walk. prop is the proportion of iterations in which
the random walk component is chosen. |
codaOut |
This is the name of the file that was output from gibbsSampler. It contains the values obtained from the Gibbs sampler. |
codaIndex |
This is the name of the file that describes the format of the variables in codaOut. |
missingfile |
Contains the missing SNP values that were output from gibbsSampler. |
SNPsubset |
A 0-1 vector of length s, indicating the SNPs that should be considered as possible variables. |
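The input layout and SNP coding described for fname can be checked in base R before calling the function. The following is a minimal sketch on made-up data, not part of BAMD: the helper recode_snp is hypothetical, the mapping 1 -> -1, 2 -> 0, 3 -> 1 is the natural reading of the coding described above, and representing missing SNPs as NA is an assumption made only for this illustration.

```r
# Hypothetical dimensions and data, for illustration only.
n <- 8; p <- 3; s <- 5
set.seed(1)
Y <- matrix(rnorm(n), n, 1, dimnames = list(NULL, "Y"))
X <- matrix(rbinom(n * p, 1, 0.5), n, p,
            dimnames = list(NULL, paste0("X", 1:p)))
# SNPs coded 1, 2, 3, with 0 marking a missing value.
Z <- matrix(sample(0:3, n * s, replace = TRUE), n, s,
            dimnames = list(NULL, paste0("Z", 1:s)))
R <- diag(n)
colnames(R) <- paste0("R", 1:n)

# Columns must appear in the order Y, X, Z, R.
input <- cbind(Y, X, Z, R)
stopifnot(ncol(input) == 1 + p + s + n,
          length(input) == n * (1 + p + s + n))

# Assumed recoding: 1, 2, 3 -> -1, 0, 1; missing (0) -> NA.
recode_snp <- function(z) {
  out <- z - 2
  out[z == 0] <- NA
  out
}
Zc <- recode_snp(Z)

# A 0-1 vector of length s that keeps SNPs 1, 2 and 4 as candidates.
SNPsubset <- c(1, 1, 0, 1, 0)
stopifnot(length(SNPsubset) == s, all(SNPsubset %in% c(0, 1)))
```

The recoded matrix Zc is shown only to make the coding concrete; the file passed as fname should still contain the original 0,1,2,3 values.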
A Metropolis-Hastings algorithm is used to conduct a stochastic search through the model space to find the best models.
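To make the idea of a stochastic search with a mixture candidate concrete, here is a toy sketch in base R. It is not the BAMD implementation. The function log_score is a hypothetical stand-in for a log Bayes factor, the second mixture component (a uniform draw over all models) is an assumption because the documentation only names the random walk component, and models are represented as 0-1 inclusion vectors purely for illustration.

```r
# Illustrative stochastic search over SNP inclusion vectors.
set.seed(42)
s <- 5; nsim <- 200; prop <- 0.75; keep <- 3

# Hypothetical score: rewards SNPs 1 and 3, penalizes model size.
log_score <- function(m) 2 * m[1] + 2 * m[3] - 0.5 * sum(m)

# Mixture candidate: random walk with probability prop,
# otherwise an independent uniform draw (assumed component).
propose <- function(m, prop) {
  if (runif(1) < prop) {
    j <- sample.int(length(m), 1)  # flip one inclusion indicator
    m[j] <- 1 - m[j]
    m
  } else {
    rbinom(length(m), 1, 0.5)
  }
}

current <- rep(0, s)
visited <- list()
for (i in seq_len(nsim)) {
  cand <- propose(current, prop)
  # Both components are symmetric, so the acceptance ratio
  # reduces to the ratio of the target scores.
  if (log(runif(1)) < log_score(cand) - log_score(current)) {
    current <- cand
  }
  visited[[paste(current, collapse = "")]] <- log_score(current)
}

# Retain the top 'keep' visited models, best first.
scores <- sort(unlist(visited), decreasing = TRUE)
best <- head(scores, keep)
print(best)
```

A larger prop makes the chain explore locally around the current model, while a smaller prop lets it jump to unrelated models more often.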
A matrix consisting of the best keep models and their Bayes Factors is returned.
Vik Gopal viknesh@stat.ufl.edu
Maintainer: Vik Gopal <viknesh@stat.ufl.edu>
Gopal, V. "BAMD User Manual" http://www.stat.ufl.edu/~viknesh/assoc_model/assoc.html
# Load example matrices and write to csv files.
data(Y, X, Z, R, Zprob)
write.csv(cbind(Y,X,Z,R), file="generatedData.csv", quote=FALSE, row.names=FALSE)
write.csv(Zprob, file="Zprob.csv", quote=FALSE, row.names=FALSE)
# Run the Gibbs sampler with 1000 iterations, keeping the last 800
gibbsSampler(fname="generatedData.csv", fprob="Zprob.csv", n=8, p=3, s=5, nsim=1000, keep=800)
# Imputed values from gibbs sampler will be used in Variable Selector
variableSelector(fname="generatedData.csv", n=8, p=3, s=5, nsim=100, keep = 5)
# Remove all generated csv and txt files
unlink("*.csv")
unlink("*.txt")