| refdata {ref} | R Documentation |
Function refdata creates objects of class refdata, which behave much like matrices or data.frames but allow for much more memory-efficient handling.
# -- usage for R CMD CHECK, see below for human readable version -----------
refdata(x)
derefdata(x)
derefdata(x) <- value
## S3 method for class 'refdata':
x[i = NULL, j = NULL, drop = FALSE, ref = FALSE]
## S3 method for class 'refdata':
x[i = NULL, j = NULL, ref = FALSE] <- value
## S3 method for class 'refdata':
dim(x)
## S3 method for class 'refdata':
dimnames(x)
## S3 method for class 'refdata':
row.names(x)
## S3 method for class 'refdata':
names(x)

# -- most important usage for human beings --------------------------------
# rd <- refdata(x)                   # create reference
# derefdata(rd)                      # retrieve original data
# derefdata(rd) <- value             # modify original data
# rd[]                               # get all (current) data
# rd[i, j]                           # get part of data
# rd[i, j, ref=TRUE]                 # get new reference on part of data
# rd[i, j] <- value                  # modify part of data (now rd is reference on local copy of the data)
# rd[i, j, ref=TRUE] <- value        # modify part of original data (respecting subsetting history)
# dim(rd)                            # dim of (subsetted) data
# dimnames(rd)                       # dimnames of (subsetted) data
x |
a matrix or data.frame or any other 2-dimensional object that has operators "[" and "[<-" defined |
i |
row index |
j |
col index |
ref |
FALSE by default. In subsetting: FALSE returns data, TRUE returns new refdata object. In assignments: FALSE modifies a local copy and returns a refdata object embedding it, TRUE modifies the original. |
drop |
FALSE by default, i.e. returned data always have a dimension attribute. TRUE drops dimensions in some cases; the exact result depends on whether a matrix or a data.frame is embedded |
value |
some value to be assigned |
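The remark that the result of drop=TRUE depends on the embedded object can be seen in plain R already. The following sketch uses only base R (no refdata objects are involved) and assumes a small example matrix and data.frame invented for illustration.

```r
# Base R only: how drop= behaves for a matrix versus a data.frame.
m  <- cbind(a = 1:3, b = 4:6)
df <- data.frame(a = 1:3, b = 4:6)

m_row_keep  <- m[1, , drop = FALSE]   # stays a 1 x 2 matrix
m_row_drop  <- m[1, , drop = TRUE]    # named vector, dim attribute is lost
df_row_drop <- df[1, , drop = TRUE]   # one row, several columns: a plain list
df_col_drop <- df[, 1, drop = TRUE]   # one column: an atomic vector

print(dim(m_row_keep))
print(dim(m_row_drop))
print(class(df_row_drop))
print(class(df_col_drop))
```

Keeping drop=FALSE as the default avoids having to distinguish these cases in code that works on both kinds of embedded data.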
Refdata objects store 2D-data in one environment and index information in another environment. Derived refdata objects usually share the data environment but not the index environment.
The index information is stored in a standardized and memory efficient form generated by optimal.index.
Thus refdata objects can be copied and subsetted and even modified without duplicating the data in memory.
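The idea of a shared data environment and a separate index environment can be sketched with base R environments. This is an illustration only, not the package's internal code; the helpers make_view and get_view are hypothetical names introduced for this sketch.

```r
# Illustration only (base R, not the ref package's internals):
# two lightweight "views" share one data environment but keep separate indices.
dat <- new.env()
dat$x <- matrix(1:20, nrow = 5)            # the data is stored once

# Hypothetical constructor: pair the shared data environment with a fresh index environment.
make_view <- function(dat, i, j) {
  ind <- new.env()
  ind$i <- i
  ind$j <- j
  list(dat = dat, ind = ind)
}

# Hypothetical accessor: apply the stored indices to the shared data.
get_view <- function(v) v$dat$x[v$ind$i, v$ind$j, drop = FALSE]

v1 <- make_view(dat, 1:5, 1:4)             # all rows and columns
v2 <- make_view(dat, 2:5, 1:4)             # same data, first row skipped

print(identical(v1$dat, v2$dat))           # both views point at the same environment
print(dim(get_view(v2)))                   # dimensions of the indexed data
```

Because environments are passed by reference in R, creating v2 costs only the memory for its index vectors.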
Empty square bracket subsetting (rd[]) returns the data; square bracket subsetting (rd[i, j]) returns subsets of the data as expected.
An additional argument (rd[i, j, ref=TRUE]) returns a reference that stores the subsetting indices. Such a reference behaves transparently, as if a smaller matrix/data.frame were stored, and can be subsetted again recursively.
With ref=TRUE, indices are always interpreted as row/col indices, i.e. x[i] and x[cbind(i, j)] are undefined (and raise errors via stop).
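Recursive subsetting of a reference amounts to folding each new index into the index already stored. A minimal base R sketch of that idea follows, assuming row indices are kept as positive integer positions; compose_index is a hypothetical helper, and the package's own normalization (optimal.index) may differ.

```r
# Hypothetical helper: apply a new index (negative, positive or logical)
# to the positions that are currently selected.
compose_index <- function(current, new) current[new]

x  <- cbind(1:5, 5:1)
i0 <- seq_len(nrow(x))          # start with all rows
i1 <- compose_index(i0, -1)     # drop the first row
i2 <- compose_index(i1, -1)     # drop the first of the remaining rows

print(i2)

# Subsetting once with the composed index gives the same result
# as subsetting the data twice.
direct  <- x[i2, , drop = FALSE]
twostep <- x[-1, , drop = FALSE][-1, , drop = FALSE]
print(identical(direct, twostep))
```

Only the index vector changes at each step, so no intermediate copy of the data is needed.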
Standard square bracket assignment (rd[i, j] <- value) creates a reference to a locally modified copy of the (potentially subsetted) data.
An additional argument (rd[i, j, ref=TRUE] <- value) modifies the original data, properly taking the subsetting history into account.
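Writing through a subsetted reference means translating positions in the subset back to positions in the original data before assigning. The base R sketch below illustrates this under the assumption that the data lives in an environment and the subsetting history has already been reduced to row and column positions; assign_through is a hypothetical helper, not a function of the package.

```r
# Illustration only: assign into the original data via stored positions.
dat <- new.env()
dat$x <- cbind(1:5, 5:1)

rows <- seq_len(nrow(dat$x))[-1][-1]   # history: first row dropped twice, leaves 3:5
cols <- seq_len(ncol(dat$x))           # all columns

# Hypothetical helper: i and j are positions within the subset.
assign_through <- function(dat, rows, cols, i, j, value) {
  dat$x[rows[i], cols[j]] <- value     # environments are modified in place
  invisible(NULL)
}

assign_through(dat, rows, cols, i = 1, j = 2, value = 0L)
print(dat$x)                           # row 3, column 2 of the original is now 0
```

The first row of the subset corresponds to row 3 of the original, so that is the cell that changes.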
A method dim(refdata) returns the dim of the (indexed) data.
A method dimnames(refdata) returns the dimnames of the (indexed) data.
an object of class refdata (appended to class attributes of data), which is an empty list with two attributes
dat |
the environment where the data x and its dimension dim are stored |
ind |
the environment where the indexes i, j and the effective subset sizes ni, nj are stored |
The refdata code is currently R only (not implemented for S+).
Please note the following differences to matrices and dataframes:
x[] |
use x[] instead of x in order to get all current data |
drop=FALSE |
drop is FALSE by default |
x[i] |
use x[][i] instead, but beware of differences between matrices and dataframes |
x[cbind()] |
use x[][cbind(i, j)] instead |
ref=TRUE |
ref needs to be used sensibly to exploit the advantages of refdata objects |
Jens Oehlschlägel
Extract, matrix, data.frame, optimal.index, ref
## Simple usage Example
x <- cbind(1:5, 5:1) # take a matrix or data frame
rx <- refdata(x) # wrap it into a refdata object
rx # see the autoprinting
rm(x) # delete original to save memory
rx[] # extract all data
rx[-1, ] # extract part of data
rx2 <- rx[-1, , ref=TRUE] # create refdata object referencing part of data (only index, no data is duplicated)
rx2 # compare autoprinting
rx2[] # extract 'all' data
rx2[-1, ] # extract part of (part of) data
cat("for more examples look at the help pages\n")
## Not run:
# Memory saving demos
square.matrix.size <- 1000
recursion.depth.limit <- 10
non.referenced.matrix <- matrix(1:(square.matrix.size*square.matrix.size), nrow=square.matrix.size, ncol=square.matrix.size)
rownames(non.referenced.matrix) <- paste("a", seq(length=square.matrix.size), sep="")
colnames(non.referenced.matrix) <- paste("b", seq(length=square.matrix.size), sep="")
referenced.matrix <- refdata(non.referenced.matrix)
recurse.nonref <- function(m, depth.limit=10){
  x <- m[1,1] # need read access here to create local copy
  gc()
  cat("depth.limit=", depth.limit, " memory.size=", memsize.wrapper(), "\n", sep="")
  if (depth.limit)
    Recall(m[-1, -1, drop=FALSE], depth.limit=depth.limit-1)
  invisible()
}
recurse.ref <- function(m, depth.limit=10){
  x <- m[1,1] # read access, otherwise nothing happens
  gc()
  cat("depth.limit=", depth.limit, " memory.size=", memsize.wrapper(), "\n", sep="")
  if (depth.limit)
    Recall(m[-1, -1, ref=TRUE], depth.limit=depth.limit-1)
  invisible()
}
gc()
memsize.wrapper()
recurse.ref(referenced.matrix, recursion.depth.limit)
gc()
memsize.wrapper()
recurse.nonref(non.referenced.matrix, recursion.depth.limit)
gc()
memsize.wrapper()
rm(recurse.nonref, recurse.ref, non.referenced.matrix, referenced.matrix, square.matrix.size, recursion.depth.limit)
## End(Not run)
cat("for even more examples look at regression.test.refdata()\n")
regression.test.refdata() # testing correctness of refdata functionality