surv2.neyman {surv2sample}    R Documentation
Description

Compares survival distributions in two samples of censored data using (possibly data-driven) Neyman's smooth test.
Usage

surv2.neyman(x, group, data.driven = FALSE, subsets = "nested",
    d = ifelse(data.driven, 5, 3), d0 = 0,
    basis = "legendre", time.transf = "F",
    approx = "perm", nsim = 2000, choltol = 1e-07)

## S3 method for class 'surv2.neyman':
summary(object, ...)
Arguments

x: a "Surv" object, as returned by the Surv function.

group: a vector indicating to which group each observation belongs. May contain values 1 and 2 only.

data.driven: logical; should the test be data-driven?

subsets: the class of subsets of basis functions among which the data-driven test selects. Possible values are "nested" and "all".

d: the number of basis functions for the test with fixed dimension, or the maximum number of basis functions for the data-driven test.

d0: the number of high-priority functions for the data-driven test. The selection rule selects among subsets containing basis functions 1,...,d0. For nested subsets, d0 equal to 0 or 1 is equivalent. For all subsets, d0 equal to 0 means that there is no high-priority function and any nonempty subset may be selected.

basis: the basis of functions. Possible values are "legendre" for Legendre polynomials and "cos" for cosines.

time.transf: the time transformation for basis functions. Possible values are "F" for the distribution function (F(t)/F(tau)) (recommended), "A" for the cumulative hazard (A(t)/A(tau)), and "I" for no transformation (the linear transformation t/tau).

approx: the method of approximating the distribution of the test statistic. Possible values are "perm" for permutations, "boot" for the bootstrap, and "asympt" for asymptotics.

nsim: the number of simulations, i.e., the number of permutations or bootstrap samples when approx is "perm" or "boot". When approx is "asympt", nsim is the number of simulations used to approximate the asymptotic distribution (only needed for the data-driven test with all subsets and d0 equal to 0).

choltol: a tolerance parameter for the Cholesky decomposition.

object: an object of class "surv2.neyman", as returned by the function surv2.neyman.

...: further parameters for printing.
Details

In general, Neyman's smooth tests are based on embedding the null hypothesis in a d-dimensional alternative. The embedding is here formulated in terms of hazard functions. The logarithm of the hazard ratio is expressed as a combination of d basis functions (Legendre polynomials or cosines) in transformed time, and their significance is tested by a score test. See Kraus (2007a) for details. The quadratic test statistic is asymptotically chi-square distributed with d degrees of freedom.
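The quadratic form behind the fixed-dimension test can be sketched in a few lines of R. The score vector U and its covariance Sigma below are hypothetical placeholders (surv2.neyman computes them internally from the censored data); the sketch only illustrates the statistic-to-p-value step.

```r
## Minimal sketch of a Neyman-type quadratic score test with d = 3.
## U and Sigma are hypothetical placeholders, not the package's internals.
d <- 3
U <- c(2.1, -0.8, 1.3)                      # hypothetical score statistics
Sigma <- diag(d)                            # hypothetical covariance (identity)
stat <- drop(t(U) %*% solve(Sigma) %*% U)   # quadratic test statistic
pval <- pchisq(stat, df = d, lower.tail = FALSE)
```

With a genuinely estimated covariance, solve(Sigma) is where the Cholesky decomposition (and the choltol argument) comes into play.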
A data-driven choice of basis functions is possible. The selection is based on a Schwarz-type criterion: the selected subset is the maximiser of the penalised score statistic over a class of nonempty subsets of {1,...,d}. Either nested subsets with increasing dimension or all subsets may be used. By choosing d0>0 one requires that functions with indexes {1,...,d0} always be included in the selected subset.
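For nested subsets, the Schwarz-type rule can be sketched as follows. The partial score statistics S and the sample size n are hypothetical illustration values; the penalty subtracts the subset dimension times log(n), and the selected dimension maximises the penalised statistic.

```r
## Sketch of the Schwarz-type selection rule for nested subsets.
## S and n are hypothetical; surv2.neyman computes S from the data.
n <- 50
S <- c(3.2, 5.1, 14.8, 15.0, 15.1)          # hypothetical statistics, dims 1..5
penalised <- S - seq_along(S) * log(n)      # Schwarz penalty: dim * log(n)
d.hat <- which.max(penalised)               # selected dimension
stat  <- S[d.hat]                           # data-driven test statistic
```

Here the statistic barely grows beyond dimension 3, so the penalty stops the selection there even though the raw statistic keeps increasing.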
If all subsets are used with d0=0, the data-driven test statistic is asymptotically distributed as the maximum of (generally dependent) chi-square variables with 1 d.f. This asymptotic approximation is accurate. In other cases, the statistic is asymptotically chi-square distributed with d^*=max(1,d0) degrees of freedom. For nested subsets with d^*=1 a two-term approximation may be used (see Kraus (2007b), eq. (12)). Otherwise the asymptotic approximation is unreliable.
In any case, the distribution of the test statistic may instead be approximated by permutations or the bootstrap.
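The permutation approximation (approx = "perm") follows the usual two-sample recipe: recompute the statistic with group labels permuted and take the proportion of permuted statistics at least as large as the observed one. The sketch below uses a toy statistic (absolute difference of group means) in place of the Neyman statistic; the resampling logic is the same.

```r
## Sketch of the permutation approximation for a two-sample statistic.
## A toy statistic stands in for the Neyman smooth-test statistic.
set.seed(1)
time  <- rexp(40)                 # toy observed times
group <- rep(1:2, each = 20)      # two groups of 20
stat.fun <- function(g) abs(mean(time[g == 1]) - mean(time[g == 2]))
obs  <- stat.fun(group)
nsim <- 1000
perm <- replicate(nsim, stat.fun(sample(group)))   # permute labels
pval <- (1 + sum(perm >= obs)) / (1 + nsim)        # permutation p-value
```

The "+1" correction keeps the p-value strictly positive, a common convention for Monte Carlo tests.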
If the test is data-driven, the summary method prints details
on the selection procedure (statistics and penalised statistics for
each considered subset). This is equivalent to print(x, detail=TRUE, ...).
Value

A list of class "surv2.neyman" and "neyman.test", with main components:

stat: the test statistic.

pval: the p-value.

stats, stats.penal: the score statistic and penalised score statistic for each considered subset (only for data-driven tests).

S.dim: the dimension of the selected set (only for data-driven tests).

S.set: the selected set (only for data-driven tests).

Most input parameters and some further components are also included.
Author(s)

David Kraus (http://www.davidkraus.net/)
References

Kraus, D. (2007a) Adaptive Neyman's smooth tests of homogeneity of two samples of survival data. Research Report 2187, Institute of Information Theory and Automation, Prague. Available at http://www.davidkraus.net/surv2sample/.

Kraus, D. (2007b) Data-driven smooth tests of the proportional hazards assumption. Lifetime Data Anal. 13, 1-16.
See Also

surv2.logrank, surv2.ks, survdiff, survfit
Examples

## gastric cancer data
data(gastric)
## test with fixed dimension
surv2.neyman(Surv(gastric$time, gastric$event), gastric$treatment,
data.driven = FALSE)
## data-driven test with nested subsets
## without minimum dimension (i.e., minimum dimension 1)
summary(surv2.neyman(Surv(gastric$time, gastric$event),
gastric$treatment, data.driven = TRUE, subsets = "nested"))
## with minimum dimension 3
summary(surv2.neyman(Surv(gastric$time, gastric$event),
gastric$treatment, data.driven = TRUE, subsets = "nested",
d0 = 3))
## data-driven test with all subsets
## without high-priority functions
summary(surv2.neyman(Surv(gastric$time, gastric$event),
gastric$treatment, data.driven = TRUE, subsets = "all"))
## with 2 high-priority functions
summary(surv2.neyman(Surv(gastric$time, gastric$event),
gastric$treatment, data.driven = TRUE, subsets = "all",
d0 = 2))