| TermDocumentMatrix {tm} | R Documentation |
Constructs a term-document matrix or a document-term matrix.
TermDocumentMatrix(object, control = list())
DocumentTermMatrix(object, control = list())
object | a corpus. |
control | a named list of control options. The component weighting must be a weighting function capable of handling a TermDocumentMatrix; it defaults to weightTf (term frequency weighting). All other options are delegated internally to a termFreq call. |
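The following is a minimal sketch of how the control list might be used, assuming a recent tm release in which termFreq accepts the options removePunctuation, stopwords, and wordLengths (older releases may name them differently). The object names tdm_default and tdm_clean are illustrative only.

```r
library(tm)
data("crude")

# No control options: term frequency weighting (weightTf) is used by default.
tdm_default <- TermDocumentMatrix(crude)

# Apart from 'weighting', the options below are assumed to be passed on to
# termFreq: strip punctuation, drop English stopwords, and keep only terms
# with at least four characters.
tdm_clean <- TermDocumentMatrix(
  crude,
  control = list(removePunctuation = TRUE,
                 stopwords = TRUE,
                 wordLengths = c(4, Inf))
)

# Compare the number of terms (rows) retained by each call.
dim(tdm_default)
dim(tdm_clean)
```

Both calls use the same 20 documents of the crude corpus, so only the number of terms (rows) should differ between the two results.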
An object of class TermDocumentMatrix or class
DocumentTermMatrix containing a sparse term-document matrix or
document-term matrix. The following slots contain useful information:
Weighting | The weighting applied to the matrix. |
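As a rough illustration of the returned objects, the sketch below builds both matrix layouts and inspects their class and shape. It assumes a recent tm release that exports the accessors nTerms, nDocs, Terms, and Docs; how the weighting information is stored differs between tm versions, so the sketch relies only on the printed summary for that.

```r
library(tm)
data("crude")

tdm <- TermDocumentMatrix(crude)
dtm <- DocumentTermMatrix(crude)

# Each constructor returns an object of the correspondingly named class.
class(tdm)
class(dtm)

# The two layouts are transposes of each other:
# terms x documents versus documents x terms.
dim(tdm)
dim(dtm)

# Accessors for the dimensions and their labels.
nTerms(tdm)
nDocs(tdm)
head(Terms(tdm))
head(Docs(tdm))

# Printing shows a short summary of the sparse matrix.
print(tdm)
```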
Ingo Feinerer
The documentation of termFreq gives an extensive list of
possible options.
Available weighting functions shipped with the tm
package are weightTf, weightTfIdf, and
weightBin.
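To make the difference between weighting functions concrete, the sketch below compares weightTf with weightBin on the crude corpus. It assumes a recent tm release in which as.matrix converts the sparse result to a dense matrix; the names m_tf and m_bin are illustrative.

```r
library(tm)
data("crude")

# Same corpus, two weighting schemes passed through the control list.
tdm_tf  <- TermDocumentMatrix(crude, control = list(weighting = weightTf))
tdm_bin <- TermDocumentMatrix(crude, control = list(weighting = weightBin))

# Dense copies are fine for a corpus this small.
m_tf  <- as.matrix(tdm_tf)
m_bin <- as.matrix(tdm_bin)

# Binary weighting records only presence or absence of a term,
# so every entry should be 0 or 1.
all(m_bin %in% c(0, 1))

# Both schemes should mark the same term-document cells as non-zero.
all((m_tf > 0) == (m_bin > 0))
```

weightTfIdf is used in the same way; it additionally scales each term by its inverse document frequency, so terms that occur in many documents receive lower weights.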
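data("crude")
# Term-document matrix (terms in rows) with tf-idf weighting, stopwords removed
tdm <- TermDocumentMatrix(crude, control = list(weighting = weightTfIdf, stopwords = TRUE))
# Document-term matrix (documents in rows) with the same settings
dtm <- DocumentTermMatrix(crude, control = list(weighting = weightTfIdf, stopwords = TRUE))
# Show the same six terms and five documents in both layouts
inspect(tdm[165:170, 1:5])
inspect(dtm[1:5, 165:170])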