| generateArtificialLongData {kml} | R Documentation |
This function builds an artificial longitudinal data set.
gald(name="",clusterNames="",nbEachClusters=50,
functionClusters=list(function(t){0},function(t){t},function(t){10-t},function(t){-0.4*t^2+4*t}),
functionNoise=function(t,sdSeq){rnorm(1,0,3)},
time=0:10,decimal=2,percentOfMissing=0)
generateArtificialLongData(name="",clusterNames="",nbEachClusters=50,
functionClusters=list(function(t){0},function(t){t},function(t){10-t},function(t){-0.4*t^2+4*t}),
functionNoise=function(t,sdSeq){rnorm(1,0,3)},
time=0:10,decimal=2,percentOfMissing=0)
name |
[character]: name of the data set. |
clusterNames |
[vector(character)]: names of the clusters. |
nbEachClusters |
[vector(numeric)]: number of trajectories that each cluster must contain. If a single number is given, it is duplicated for all groups (see Details). |
functionClusters |
[list(function)]: lists the functions defining the average trajectory of each cluster. If a single function is given, it is duplicated for all groups (see Details). |
functionNoise |
[list(function)]: lists the functions generating the noise of each trajectory within its own cluster. If a single function is given, it is duplicated for all groups (see Details). |
time |
[vector(numeric)]: time at which measures are made. |
decimal |
[numeric]: number of decimal places to which values are rounded. |
percentOfMissing |
[numeric]: proportion (between 0 and 1) of missing data generated in each cluster. If a single value is given, it is duplicated for all groups (see Details). |
generateArtificialLongData (gald for short) is a
function that lets the user construct artificial trajectories.
Each individual is considered as belonging to a group. Each group
follows a theoretical trajectory that is a function of time. These functions (one per group) are given via the argument functionClusters.
Within a group, each individual undergoes individual variations, which are given via the argument functionNoise.
The number of individuals in each group is given by nbEachClusters.
Finally, missing values can be added at random positions in the
data via percentOfMissing.
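The model described above (group mean plus individual noise, then rounding and random missing values) can be sketched in base R as follows. This is only an illustration under stated assumptions, not the kml source: the names simulateGroup, meanFun and noiseFun are hypothetical, the noise function is assumed to take a single argument t, and the number of missing cells is assumed to be the proportion times the number of cells, rounded down.

```r
# Illustrative sketch only: NOT the kml implementation.
# simulateGroup, meanFun and noiseFun are hypothetical names.
simulateGroup <- function(n, meanFun, noiseFun, time,
                          decimal = 2, percentOfMissing = 0) {
  # One row per individual, one column per measurement time.
  traj <- matrix(NA_real_, nrow = n, ncol = length(time))
  for (i in seq_len(n)) {
    for (j in seq_along(time)) {
      # Theoretical group trajectory plus an individual deviation.
      traj[i, j] <- meanFun(time[j]) + noiseFun(time[j])
    }
  }
  traj <- round(traj, decimal)
  # Assumption: blank out floor(proportion * number of cells) cells at random.
  nMissing <- floor(length(traj) * percentOfMissing)
  traj[sample(length(traj), nMissing)] <- NA
  traj
}

# Two groups: an increasing line and a decreasing line with missing values.
set.seed(1)
time <- 0:10
groupA <- simulateGroup(5, function(t) t, function(t) rnorm(1, 0, 3), time)
groupB <- simulateGroup(5, function(t) 10 - t, function(t) rnorm(1, 0, 3), time,
                        percentOfMissing = 0.2)
traj <- rbind(groupA, groupB)
```

The resulting traj is a plain numeric matrix with one row per individual, so it can be inspected with the usual base R tools such as dim or matplot.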
Note that the number of clusters is defined as the greatest length among the
arguments nbEachClusters, functionClusters,
functionNoise and percentOfMissing. So at least one of
these four arguments should provide one value per cluster.
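The rule for the number of clusters can be illustrated with a small base R helper. This is a sketch under assumptions, not the package code: resolveClusterArgs is a hypothetical name, and shorter arguments are assumed to be recycled with rep up to the number of clusters, which matches the documented duplication of single values.

```r
# Illustrative sketch only: NOT the kml implementation.
# resolveClusterArgs is a hypothetical helper name.
resolveClusterArgs <- function(nbEachClusters, functionClusters,
                               functionNoise, percentOfMissing) {
  # Wrap a bare function in a list so that it can be replicated by rep().
  asList <- function(x) if (is.function(x)) list(x) else x
  functionClusters <- asList(functionClusters)
  functionNoise <- asList(functionNoise)

  # The number of clusters is the greatest length among the four arguments.
  nbClusters <- max(length(nbEachClusters), length(functionClusters),
                    length(functionNoise), length(percentOfMissing))

  # Assumption: shorter arguments are recycled up to nbClusters.
  list(
    nbClusters       = nbClusters,
    nbEachClusters   = rep(nbEachClusters, length.out = nbClusters),
    functionClusters = rep(functionClusters, length.out = nbClusters),
    functionNoise    = rep(functionNoise, length.out = nbClusters),
    percentOfMissing = rep(percentOfMissing, length.out = nbClusters)
  )
}

resolved <- resolveClusterArgs(
  nbEachClusters   = c(40, 20, 10),
  functionClusters = list(function(t) 0, function(t) -t, function(t) t),
  functionNoise    = function(t) rnorm(1, 0, 3),
  percentOfMissing = 0.25
)
resolved$nbClusters             # 3
length(resolved$functionNoise)  # 3
```

Here the single noise function and the single proportion of missing values are each duplicated three times, because nbEachClusters and functionClusters both have length 3.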
An object of class ArtificialLongData.
Christophe Genolini
PSIGIAM: Paris Sud Innovation Group in Adolescent Mental Health
INSERM U669 / Maison de Solenn / Paris
Contact author: <genolini@u-paris10.fr>
Raphaël Ricaud
Laboratoire "Sport & Culture" / "Sports & Culture" Laboratory
University of Paris 10 / Nanterre
Article submitted
Web site: http://christophe.genolini.free.fr/kml
Overview: kml-package
Classes: ArtificialLongData
Methods: kml, choice
Plot: plot(ClusterizLongData),
plotSubGroups(ClusterizLongData), plotAll(ClusterizLongData)
par(ask=TRUE)
### Default example
ex1 <- generateArtificialLongData()
ex1
plot(ex1,colorTraj="black",colorMean="no")
plot(ex1)
### Three diverging lines
ex2 <- generateArtificialLongData(functionClusters=list(function(t)0,function(t)-t,function(t)t))
ex2
plot(ex2,colorTraj="black",colorMean="no")
plot(ex2)
### Three diverging lines with high variance, unbalanced groups and missing values
ex3 <- generateArtificialLongData(
functionClusters=list(function(t)0,function(t)-t,function(t)t),
nbEachClusters=c(40,20,10),
functionNoise=function(t){rnorm(1,0,3)},
percentOfMissing=c(0.25,0.5,0.25)
)
ex3
plot(ex3,colorTraj="black",colorMean="no")
plot(ex3)
### Four strange functions
ex4 <- generateArtificialLongData(
name="Four strange functions",
clusterNames=c("Line","Poly2","Normal","Sinus"),
nbEachClusters=c(100,300,200,100),
functionClusters=list(function(t){-10+2*t},function(t){-0.6*t^2+6*t-7.5},function(t){10*sin(t)},function(t){30*dnorm(t,2,1.5)}),
functionNoise=function(t){rnorm(1,0,5)},
time=0:10,decimal=2,percentOfMissing=0.3)
ex4
plot(ex4,colorTraj="black",colorMean="no")
plot(ex4)
### To get only the trajectories (for example, to feed artificial
### trajectories to another algorithm), use the getter ["traj"]
ex5 <- gald(nbEachClusters=3,time=1:3)
ex5["traj"]
par(ask=FALSE)