| generateArtificialLongData {longitudinalData} | R Documentation |
This function builds an artificial longitudinal data set and turns it
into an object of class LongData.
gald(nbEachClusters=50,time=0:10,decimal=2,percentOfMissing=0,
functionClusters=list(function(t){0},function(t){t},function(t){10-t},function(t){-0.4*t^2+4*t}),
functionNoise=function(t){rnorm(1,0,3)})
generateArtificialLongData(nbEachClusters=50,time=0:10,decimal=2,percentOfMissing=0,
functionClusters=list(function(t){0},function(t){t},function(t){10-t},function(t){-0.4*t^2+4*t}),
functionNoise=function(t){rnorm(1,0,3)})
nbEachClusters |
[numeric] or [vector(numeric)]: number of trajectories that each cluster must contain. If a single number is given, it is duplicated for all groups (see Details). |
functionClusters |
[function] or [list(function)]: the functions defining the average trajectory of each cluster. If a single function is given, it is duplicated for all groups (see Details). |
functionNoise |
[function] or [list(function)]: the functions generating the noise of each trajectory within its own cluster. If a single function is given, it is duplicated for all groups (see Details). |
time |
[vector(numeric)]: times at which measurements are made. |
decimal |
[numeric]: number of decimal places to which values are rounded. |
percentOfMissing |
[numeric]: proportion (between 0 and 1) of missing data generated in each cluster. If a single value is given, it is duplicated for all groups (see Details). |
generateArtificialLongData (gald for short) is a
function that constructs a set of artificial longitudinal data.
Each individual is considered as belonging to a group. Each group
follows a theoretical trajectory, which is a function of time. These functions (one per group) are given via the argument functionClusters.
Within a group, each individual undergoes individual variations, given via the argument functionNoise.
The number of individuals in each group is given by nbEachClusters.
Finally, missing values can be added at random positions in the
data via percentOfMissing.
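The generation scheme described above can be sketched in base R without the package. This is only an illustration under stated assumptions, not the package's actual implementation: the helper simulate_group and all snake_case variable names are hypothetical, and the way missing values are placed (a fixed share of cells per group, chosen uniformly at random) is an assumption.

```r
set.seed(1)

time <- 0:10
function_clusters <- list(function(t) 0, function(t) t, function(t) 10 - t)
function_noise <- function(t) rnorm(1, 0, 3)
nb_each_cluster <- c(5, 5, 5)
percent_of_missing <- 0.2
decimal <- 2

# Hypothetical helper: simulate one group as a matrix with
# one row per individual and one column per time point.
simulate_group <- function(mean_fun, n, p_missing) {
  values <- t(replicate(
    n,
    vapply(time, function(t) mean_fun(t) + function_noise(t), numeric(1))
  ))
  # Assumption: blank out a fixed share of the group's cells at random.
  n_missing <- floor(p_missing * length(values))
  values[sample(length(values), n_missing)] <- NA
  round(values, decimal)
}

# Stack the groups; a single percent_of_missing value is reused for every group.
traj <- do.call(
  rbind,
  Map(simulate_group, function_clusters, nb_each_cluster, percent_of_missing)
)

# Group membership of each row, analogous to the trueClusters information.
true_clusters <- rep(seq_along(nb_each_cluster), nb_each_cluster)
```

Here traj has one row per simulated individual, and true_clusters records which group each row was drawn from.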
Note that the number of clusters is defined as the largest length among
the arguments nbEachClusters, functionClusters,
functionNoise and percentOfMissing. So at least one of
these four arguments should be defined for each cluster.
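The rule for the number of clusters can be illustrated with a short base-R sketch. It is a guess at the logic, not the package's code: the helper as_list and the snake_case names are hypothetical, and only the documented case (a single value duplicated for all groups) is handled.

```r
nb_each_clusters <- 50
function_clusters <- list(function(t) 0, function(t) t, function(t) 10 - t)
function_noise <- function(t) rnorm(1, 0, 3)
percent_of_missing <- c(0.25, 0.5, 0.25)

# Hypothetical helper: a bare function counts as a single entry.
as_list <- function(x) if (is.function(x)) list(x) else x

settings <- list(
  nb_each_clusters,
  as_list(function_clusters),
  as_list(function_noise),
  percent_of_missing
)

# The number of clusters is the largest length among the four arguments.
nb_clusters <- max(lengths(settings))

# Single values are duplicated so that every cluster has its own entry.
nb_each_clusters <- rep(nb_each_clusters, length.out = nb_clusters)
function_noise <- rep(as_list(function_noise), length.out = nb_clusters)
```

With these inputs, three clusters are inferred from functionClusters and percentOfMissing, and the single size and noise function are reused for each of them.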
An object of class LongData. Note that the field
other of the LongData object contains the information that was used to generate
the data set: functionClusters, functionNoise,
percentOfMissing and trueClusters.
Christophe Genolini
PSIGIAM: Paris Sud Innovation Group in Adolescent Mental Health
INSERM U669 / Maison de Solenn / Paris
Contact author: <genolini@u-paris10.fr>
LongData, longData,
as.longData, plot
par(ask=TRUE)
### Default example
ex1 <- generateArtificialLongData()
ex1
plot(ex1,col=1,type.mean="n")
part1 <- partition(rep(1:4,each=50),4)
plot(ex1,part1)
### Three diverging lines
ex2 <- generateArtificialLongData(functionClusters=list(function(t)0,function(t)-t,function(t)t))
part2 <- partition(rep(1:3,each=50),3)
plot(ex2,part2)
### Three diverging lines with high variance, unbalanced groups and missing values
ex3 <- generateArtificialLongData(
functionClusters=list(function(t)0,function(t)-t,function(t)t),
nbEachClusters=c(100,30,10),
functionNoise=function(t){rnorm(1,0,3)},
percentOfMissing=c(0.25,0.5,0.25)
)
part3 <- partition(rep(1:3,c(100,30,10)),3)
plot(ex3,part3)
### Four strange functions
ex4 <- generateArtificialLongData(
nbEachClusters=c(300,200,100,100),
functionClusters=list(function(t){-10+2*t},function(t){-0.6*t^2+6*t-7.5},function(t){10*sin(t)},function(t){30*dnorm(t,2,1.5)}),
functionNoise=function(t){rnorm(1,0,3)},
time=0:10,decimal=2,percentOfMissing=0.3)
part4 <- partition(rep(1:4,c(300,200,100,100)),4)
plot(ex4,part4)
### To get only the trajectories (if you want some artificial longitudinal data
### to use with another algorithm), use the getter ["traj"]
ex5 <- gald(nbEachClusters=3,time=1:3)
ex5["traj"]
par(ask=FALSE)