| convertRCV1Plain {tm} | R Documentation |
Transform a Reuters Corpus Volume 1 XML document to a plain text document.
convertRCV1Plain(node, ...)
| node | an XML node representing a <newsitem></newsitem> element from a well-formed RCV1 XML file. |
| ... | Arguments passed on by calling functions. |
A PlainTextDocument representing node.
Ingo Feinerer
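# Locate the sample RCV1 files shipped with tm
rcv1 <- system.file("texts", "rcv1", package = "tm")
# Read them into a text document collection using the RCV1 reader
rcv1TDC <- TextDocCol(DirSource(rcv1),
                      readerControl = list(reader = readRCV1, language = "en_US", load = TRUE))
# Show the first document, then convert it to a plain text document
rcv1TDC[[1]]
asPlain(rcv1TDC[[1]], convertRCV1Plain)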
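
To illustrate what turning a <newsitem></newsitem> node into plain text involves, the following is a minimal sketch and not tm's implementation. It assumes the XML package is installed. The function name sketchRCV1Plain and the trimmed sample news item are hypothetical, and the result is an ordinary list rather than a PlainTextDocument.

```r
library(XML)

# Hypothetical, heavily trimmed RCV1-style news item (not taken from the corpus)
sampleItem <- '<newsitem itemid="1" date="1996-08-20">
<title>Example title</title>
<text><p>First paragraph.</p><p>Second paragraph.</p></text>
</newsitem>'

# Parse the string and take the root <newsitem> node
node <- xmlRoot(xmlTreeParse(sampleItem, asText = TRUE))

# Hypothetical helper: pull the id, heading and paragraph text out of a node
sketchRCV1Plain <- function(node) {
  list(
    id      = xmlAttrs(node)[["itemid"]],
    heading = xmlValue(node[["title"]]),
    # One character string per <p> child of <text>
    content = unlist(xmlApply(node[["text"]], xmlValue), use.names = FALSE)
  )
}

doc <- sketchRCV1Plain(node)
doc$content
```

Returning a plain list keeps the sketch independent of tm's document classes. The real function returns a PlainTextDocument, as stated above.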