| corpus_sample {lsa} | R Documentation |
Generate a random sample of a document collection.
corpus_sample(filelist, samplesize, index.return = FALSE)
filelist |
a vector containing (relative or absolute) filenames. |
samplesize |
the desired number of files to be returned. |
index.return |
if set to TRUE, the positions of the sampled files in filelist are returned as well. |
Draws a random sample of samplesize files from
the specified filelist.
x |
The random sample; a vector with filenames. |
ix |
Only when index.return is set to TRUE: a list is returned in which x contains
the filenames and ix contains the positions of the sampled files in the
original filelist. |
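The two return shapes described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name sample_files is hypothetical, and drawing without replacement via sample.int() is an assumption about how the sample is taken.

```r
# Hypothetical helper mirroring the documented interface; NOT the lsa source.
sample_files <- function(filelist, samplesize, index.return = FALSE) {
  # Assumption: positions are drawn without replacement.
  # sample.int() raises an error if samplesize exceeds length(filelist).
  ix <- sample.int(length(filelist), samplesize)
  if (index.return) {
    # list form: filenames plus their positions in the original filelist
    list(x = filelist[ix], ix = ix)
  } else {
    # plain form: only the sampled filenames
    filelist[ix]
  }
}

files <- c("D1", "D2", "D3")
s <- sample_files(files, 2, index.return = TRUE)

# the returned positions index back into the original vector
identical(files[s$ix], s$x)
```

Returning the positions alongside the filenames lets the caller look up the sampled files in any other structure that is aligned with filelist.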
Fridolin Wild fridolin.wild@wu-wien.ac.at
# create some files
td = tempfile()
dir.create(td)
write( c("dog", "cat", "mouse"), file=paste(td, "D1", sep="/") )
write( c("hamster", "mouse", "sushi"), file=paste(td, "D2", sep="/") )
write( c("dog", "monster", "monster"), file=paste(td, "D3", sep="/") )
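# draw 2 of the 3 files at random, keeping their positions in the file list
s = corpus_sample(dir(td, full.names=TRUE), 2, index.return=TRUE)
# s$x holds the sampled filenames; build a textmatrix from them
textmatrix(s$x)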
# clean up (recursive=TRUE is required to remove a directory)
unlink(td, recursive=TRUE)