| similarity-methods {arulesSequences} | R Documentation |
Provides the generic function similarity and the S4 method
to compute similarities among a collection of sequences.
is.subset and is.superset find subsequence or supersequence
relationships among a collection of sequences.
similarity(x, y = NULL, ...)
## S4 method for signature 'sequences':
similarity(x, y = NULL,
method = c("jaccard", "dice", "cosine", "subset"),
strict = FALSE)
## S4 method for signature 'sequences':
is.subset(x, y = NULL, proper = FALSE)
## S4 method for signature 'sequences':
is.superset(x, y = NULL, proper = FALSE)
x, y | an object. |
... | further (unused) arguments. |
method | a string specifying the similarity measure to use (see details). |
strict | a logical value specifying if strict itemset matching should be used. |
proper | a logical value specifying if only strict relationships (omitting equality) should be indicated. |
Let the number of common elements of two sequences refer to those that occur in a longest common subsequence. The following similarity measures are implemented:
jaccard, dice, cosine, and subset.
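The formulas behind these measures are not spelled out on this page. As a rough sketch, the textbook set-based definitions of Jaccard, Dice, and cosine similarity can be applied with the number of common elements standing in for the intersection size. The helper name seq_similarity and its arguments are hypothetical, the formulas are an assumption about what the package computes, and the subset measure is left out because its definition is not given here.

```r
# common: number of common elements, i.e. the length of a longest
#         common subsequence (assumed to be computed elsewhere).
# n, m:   number of elements in the first and second sequence.
seq_similarity <- function(common, n, m,
                           method = c("jaccard", "dice", "cosine")) {
  method <- match.arg(method)
  switch(method,
    # common elements over all distinct elements
    jaccard = common / (n + m - common),
    # common elements counted twice over the summed lengths
    dice    = 2 * common / (n + m),
    # common elements over the geometric mean of the lengths
    cosine  = common / sqrt(n * m)
  )
}

seq_similarity(2, 3, 4, "jaccard")  # 2 / 5
seq_similarity(2, 3, 4, "dice")     # 4 / 7
seq_similarity(2, 3, 4, "cosine")   # 2 / sqrt(12)
```

With the same inputs, dice is never smaller than jaccard, which fits the "emphasize common" use of method = "dice".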
If strict = TRUE, the elements (itemsets) of the sequences must
be equal to be matched. Otherwise, a match is quantified by the
similarity of the two itemsets (as specified by method), thresholded
at 0.5, and the common sequence by the sum of these similarities.
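The difference between strict and non-strict matching of two itemsets might be sketched as follows. This is an illustration only: itemset_jaccard and match_weight are hypothetical helpers, Jaccard is used as the itemset similarity, itemsets are assumed to be non-empty character vectors, and treating a similarity of exactly 0.5 as a match is an assumption.

```r
# Textbook Jaccard similarity between two non-empty itemsets.
itemset_jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Weight contributed by matching itemset a against itemset b.
match_weight <- function(a, b, strict = FALSE, threshold = 0.5) {
  if (strict) {
    # strict matching: the itemsets must be equal
    return(as.numeric(setequal(a, b)))
  }
  s <- itemset_jaccard(a, b)
  # Assumption: similarities below the threshold count as no match.
  if (s >= threshold) s else 0
}

match_weight(c("A", "B"), c("A", "B", "C"), strict = TRUE)  # no match
match_weight(c("A", "B"), c("A", "B", "C"))                 # partial match, 2/3
match_weight("A", c("A", "B", "C"))                         # 1/3 is below 0.5
```

Summing such weights along a common subsequence gives a fractional count of common elements instead of an integer one.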
For similarity, returns an object of class dsCMatrix if the result
is symmetric (or method = "subset") and an object of class
dgCMatrix otherwise.
is.subset and is.superset return an object of class lgCMatrix.
Computation of the longest common subsequence of two sequences of
length n, m takes O(n*m) time.
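The quadratic cost comes from the standard dynamic-programming table with one cell per pair of positions. Below is a minimal sketch in plain R, assuming sequences are represented as lists of itemsets (character vectors) and that itemsets match only when equal. The function lcs_length is hypothetical and is not the package's internal implementation.

```r
# x, y: sequences given as lists of itemsets (character vectors).
lcs_length <- function(x, y) {
  n <- length(x)
  m <- length(y)
  # tab[i + 1, j + 1] holds the LCS length of x[1:i] and y[1:j];
  # row 1 and column 1 stand for the empty prefix.
  tab <- matrix(0, nrow = n + 1, ncol = m + 1)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      tab[i + 1, j + 1] <- if (setequal(x[[i]], y[[j]])) {
        tab[i, j] + 1                      # extend a common subsequence
      } else {
        max(tab[i, j + 1], tab[i + 1, j])  # drop one element
      }
    }
  }
  tab[n + 1, m + 1]
}

x <- list("A", c("B", "C"), "D")
y <- list("A", "D", c("B", "C"), "D")
lcs_length(x, y)  # 3: the common subsequence is A, {B, C}, D
```

Each of the n * m cells is filled once, so running time grows with the product of the sequence lengths when itemset comparison is treated as constant cost.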
The supported set of operations for the above matrix classes depends
on package Matrix. In case of problems, expand to full storage
representation using as(x, "matrix") or as.matrix(x).
For efficiency use as(x, "dist") to convert a symmetric
result matrix for clustering.
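A clustering workflow on a symmetric similarity matrix might look like the sketch below. It uses only base R on a small dense matrix and does not use the package's own as(x, "dist") coercion. The similarity values are made up for illustration, and converting similarity to dissimilarity as 1 - similarity is an assumption.

```r
# Hypothetical similarity values for four sequences (not package output).
s <- matrix(c(1.0, 0.8, 0.2, 0.1,
              0.8, 1.0, 0.3, 0.2,
              0.2, 0.3, 1.0, 0.7,
              0.1, 0.2, 0.7, 1.0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("s", 1:4), paste0("s", 1:4)))

# Assumption: dissimilarity is taken as 1 - similarity.
d <- as.dist(1 - s)

# Average-linkage hierarchical clustering, cut into two groups.
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
groups
```

A "dist" object stores only the lower triangle, which is why converting a symmetric result before clustering saves memory.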
Christian Buchta
Class sequences, method dissimilarity.
library(arulesSequences)

## use example data
data(zaki)
z <- as(zaki, "timedsequences")
similarity(z)

## require equality of itemsets
similarity(z, strict = TRUE)

## emphasize common elements
similarity(z, method = "dice")

## subsequence relationships
is.subset(z)
is.subset(z, proper = TRUE)