| msc.peaks.clust {caMassClass} | R Documentation |
Clusters peaks from multiple protein mass spectra (SELDI) samples
msc.peaks.clust(dM, S, BinSize=c(0,sum(dM)), tol=0.97, verbose=FALSE)
S |
Peak sample number, used to identify the spectrum the peak comes from. |
dM |
Distance between sorted peak positions (masses, m/z). |
BinSize |
Lower and upper bounds of bin sizes, based on expected
experimental variation in the mass (m/z) values. The size of any bin is
measured as (R-L)/mean(R,L), where L and R are the masses
(m/z values) of the left and right boundaries.
All resulting bin sizes will be between BinSize[1] and
BinSize[2]. The default is c(0,sum(dM)), which effectively
disables the bin-size constraint. |
tol |
Gaps bigger than tol*max(gap) are assumed to be the same
size as the largest gap. See details. |
verbose |
Boolean flag that turns debugging printouts on. |
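The bin-size measure and the tol rule described in the arguments above can be sketched in a few lines of R. This is only an illustration, not the package's internal code: the helper name rel_bin_size and the gap values are hypothetical. Note that mean(R,L) in the text is notation for the average of the two boundaries; in R it has to be written mean(c(L, R)), because mean(R, L) would treat L as the trim argument.

```r
# Hypothetical helper: relative width of a bin with boundaries L and R (m/z)
rel_bin_size <- function(L, R) (R - L) / mean(c(L, R))

size <- rel_bin_size(1000, 1005)   # 5 / 1002.5

# tol rule: gaps bigger than tol * max(gap) are treated as ties with the largest gap
gap  <- c(2, 9.8, 4, 10, 9.5)      # made-up gaps between sorted peaks
tol  <- 0.97
ties <- which(gap > tol * max(gap))  # positions of the candidate gaps
```

With these made-up values only the second and fourth gaps exceed 0.97 times the largest gap, so they are the ones that would be considered equally large.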
This is a low-level function used by msc.peaks.alignment and is not intended to
be used directly by most users. However, it might be useful for other code
developers. It clusters peaks from different samples into bins in
such a way as to satisfy constraints in the following order:
BinSize[1] and BinSize[2]
The output is a binary array of the same size as dM and S, in which
the left boundary of each cluster-bin (biomarker) is marked.
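A binary marker array of this kind can be turned into bin numbers with cumsum. The sketch below uses a hand-written marker vector, not real msc.peaks.clust output, and assumes that the first peak is marked as the start of the first bin; the names marks, bin_id and peaks_by_bin are hypothetical.

```r
# Hypothetical marker vector for 8 sorted peaks (1 = first peak of a bin)
marks <- c(1, 0, 0, 1, 0, 1, 0, 0)
M     <- c(1, 1, 3, 5, 5, 7, 7, 8)   # made-up sorted masses

n_bins       <- sum(marks)           # number of bins
bin_id       <- cumsum(marks)        # bin number of each peak
peaks_by_bin <- split(M, bin_id)     # masses grouped by bin
```

Here the three markers split the eight peaks into three bins of sizes 3, 2 and 3.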
Jarek Tuszynski (SAIC) jaroslaw.w.tuszynski@saic.com
The initial version of this function started as an implementation of the algorithm described on the web page of the Virginia Prostate Center (at Virginia Medical School) documenting their PeakMiner software. See http://www.evms.edu/vpc/seldi/peakminer.pdf
msc.preprocess.run and msc.project.run pipelines.
msc.peaks.find
msc.peaks.align and msc.biomarkers.fill
msc.peaks.align function
# example with simple made up data (18 peaks, 3 samples)
M = c(1,5,8,12,17,22, 3,5,7,11,14,25, 1, 5, 7,10,17,21) # peak position/mass
S = rep(1:3, each=6) # peak's sample number
idx = sort(M, index.return=TRUE)$ix # sort peaks by mass
M = M[idx] # sorted mass
S = S[idx] # arrange sample numbers in the same order
bin = msc.peaks.clust(diff(M), S, verbose=TRUE)
rbind(S,M,bin) # show results
# use the results to align peaks into biomarkers matrix
Bmrks = matrix(NA,sum(bin),max(S)) # init feature (biomarker) matrix
bin = cumsum(bin) # find bin numbers for each peak in S array
for (j in seq_along(S)) # Bmrks usually store height H of each peak
  Bmrks[bin[j], S[j]] = M[j] # but in this example it will be mass
Bmrks
stopifnot( dim(Bmrks)==c(7,3) )
stopifnot( sum(is.na(Bmrks[5,]))==2 )
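The peak-by-peak loop that fills Bmrks can also be written with matrix indexing. The following is a self-contained sketch on a smaller, made-up data set; the bin numbers are written by hand for illustration and are not output of msc.peaks.clust.

```r
M   <- c(1, 1, 3, 5, 5, 7)   # made-up sorted masses
S   <- c(1, 3, 2, 1, 2, 3)   # sample number of each peak
bin <- c(1, 1, 1, 2, 2, 3)   # hypothetical bin number of each peak

Bmrks <- matrix(NA_real_, max(bin), max(S))  # bins in rows, samples in columns
Bmrks[cbind(bin, S)] <- M    # one (row, column) index pair per peak
Bmrks
```

Each row of cbind(bin, S) addresses one cell of the matrix, so the assignment does the same work as the loop in a single step; cells with no peak stay NA.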