Simulate Correlated Progression-Free Survival and Overall Survival Using a Gumbel Copula
Progression-free survival (PFS) and overall survival (OS) are common time-to-event endpoints in oncology trials. Because progression or death defines a PFS event, simulated trial data should always satisfy $\mathrm{PFS} \le \mathrm{OS}$. At the same time, the two endpoints are often highly correlated, so independent marginal simulation is usually not appropriate.
This vignette describes the copula-based generator implemented in
TrialSimulator::CorrelatedPfsAndOs2(). The function is
intended for simulation settings where the user wants to specify three
interpretable quantities:
- the marginal median PFS,
- the marginal median OS,
- Kendall’s tau between the observed, uncensored PFS and OS times.
The method uses a Gumbel survival copula for a latent time to progression (TTP) and OS, and then defines

$$\mathrm{PFS} = \min(\mathrm{TTP}, \mathrm{OS}).$$

This construction guarantees $\mathrm{PFS} \le \mathrm{OS}$ for every simulated patient. It also keeps both marginal PFS and OS exponential, so the inputs can be specified directly through their medians.
This is a useful distinction from the illness-death model in
TrialSimulator::CorrelatedPfsAndOs3(). The illness-death
model gives a clinically interpretable transition structure, but OS
generated from that model can have a time-varying hazard ratio between
treatment arms. Therefore, if OS will be analyzed using a proportional
hazards Cox model and the simulation should be consistent with that
analysis model, the copula-based method described here is often more
appropriate.
Model
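Let $T_{\mathrm{TTP}}$ denote latent TTP and let $T_{\mathrm{OS}}$ denote OS. The model assumes exponential marginal survival functions

$$P(T_{\mathrm{TTP}} > t) = \exp(-\lambda_{\mathrm{TTP}}\, t), \qquad P(T_{\mathrm{OS}} > s) = \exp(-\lambda_{\mathrm{OS}}\, s),$$

and a Gumbel–Hougaard survival copula

$$P(T_{\mathrm{TTP}} > t,\, T_{\mathrm{OS}} > s) = \exp\left\{-\left[(\lambda_{\mathrm{TTP}}\, t)^{\theta} + (\lambda_{\mathrm{OS}}\, s)^{\theta}\right]^{1/\theta}\right\}, \qquad \theta \ge 1.$$

The observed PFS time is

$$T_{\mathrm{PFS}} = \min(T_{\mathrm{TTP}}, T_{\mathrm{OS}}).$$

Under this model, OS is exponential with rate $\lambda_{\mathrm{OS}}$. PFS is also exponential, with survival function

$$P(T_{\mathrm{PFS}} > t) = P(T_{\mathrm{TTP}} > t,\, T_{\mathrm{OS}} > t) = \exp(-\lambda_{\mathrm{PFS}}\, t),$$

where

$$\lambda_{\mathrm{PFS}} = \left(\lambda_{\mathrm{TTP}}^{\theta} + \lambda_{\mathrm{OS}}^{\theta}\right)^{1/\theta}.$$

Therefore, if the target medians are $m_{\mathrm{PFS}}$ for PFS and $m_{\mathrm{OS}}$ for OS,

$$\lambda_{\mathrm{PFS}} = \frac{\log 2}{m_{\mathrm{PFS}}}, \qquad \lambda_{\mathrm{OS}} = \frac{\log 2}{m_{\mathrm{OS}}}.$$

For a candidate value of $\theta$, the latent TTP rate is then

$$\lambda_{\mathrm{TTP}} = \left(\lambda_{\mathrm{PFS}}^{\theta} - \lambda_{\mathrm{OS}}^{\theta}\right)^{1/\theta}.$$

This requires $\lambda_{\mathrm{PFS}} > \lambda_{\mathrm{OS}}$, that is $m_{\mathrm{PFS}} < m_{\mathrm{OS}}$, as expected for PFS and OS.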
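The rate calculations in this section can be sketched in a few lines of base R. This is an illustration only: the helper name `latent_rates()` and the candidate value `theta = 2` are assumptions made for the example, not part of TrialSimulator.

```r
# Hypothetical helper (not a TrialSimulator function): convert target medians
# and a candidate Gumbel parameter theta into exponential rates.
latent_rates <- function(median_pfs, median_os, theta) {
  stopifnot(median_pfs > 0, median_os > median_pfs, theta >= 1)
  lambda_pfs <- log(2) / median_pfs
  lambda_os  <- log(2) / median_os
  # The PFS rate satisfies lambda_pfs^theta = lambda_ttp^theta + lambda_os^theta,
  # so the latent TTP rate is obtained by solving for lambda_ttp.
  lambda_ttp <- (lambda_pfs^theta - lambda_os^theta)^(1 / theta)
  c(pfs = lambda_pfs, os = lambda_os, ttp = lambda_ttp)
}

# Illustrative call with the medians used later in this vignette
rates <- latent_rates(median_pfs = 5, median_os = 11, theta = 2)
rates
```

The `stopifnot()` guard mirrors the requirement that the PFS median be smaller than the OS median; otherwise the latent TTP rate would not be a positive real number.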
Choosing the Copula Parameter
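For the Gumbel copula, Kendall’s tau between latent TTP and OS is

$$\tau_{\mathrm{latent}} = 1 - \frac{1}{\theta}.$$

However, CorrelatedPfsAndOs2() asks for Kendall’s tau
between the observed, uncensored endpoints
$T_{\mathrm{PFS}}$ and $T_{\mathrm{OS}}$,
not between latent TTP and OS. These are not the same quantity. The
formula below is derived in the appendix. Under the model above,

$$\tau_{\mathrm{obs}} = 1 - \frac{1}{\theta}\left[1 - \left(\frac{\lambda_{\mathrm{OS}}}{\lambda_{\mathrm{PFS}}}\right)^{\theta}\right].$$

Equivalently, writing $r = m_{\mathrm{PFS}} / m_{\mathrm{OS}} = \lambda_{\mathrm{OS}} / \lambda_{\mathrm{PFS}}$ and $\theta = 1 / (1 - \tau_{\mathrm{latent}})$,

$$\tau_{\mathrm{obs}} = 1 - (1 - \tau_{\mathrm{latent}})\left[1 - r^{1/(1 - \tau_{\mathrm{latent}})}\right].$$

CorrelatedPfsAndOs2() solves this equation numerically
for the latent Kendall’s tau $\tau_{\mathrm{latent}}$,
converts it to $\theta = 1 / (1 - \tau_{\mathrm{latent}})$,
and then simulates from the Gumbel copula.

For any finite $\theta$, $\tau_{\mathrm{obs}}$ is larger than $\tau_{\mathrm{latent}}$:

$$\tau_{\mathrm{obs}} - \tau_{\mathrm{latent}} = \frac{1}{\theta}\left(\frac{m_{\mathrm{PFS}}}{m_{\mathrm{OS}}}\right)^{\theta} > 0.$$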
Intuitively, even if latent TTP and OS have a weaker association, observed PFS contains deaths by construction, which increases the association between observed PFS and OS.
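The requested value of $\tau_{\mathrm{obs}}$, Kendall’s tau between observed PFS and OS, has a lower bound. When $\theta = 1$, TTP and OS are independent, but PFS still contains OS deaths through the definition $\mathrm{PFS} = \min(\mathrm{TTP}, \mathrm{OS})$. In this limiting case,

$$\tau_{\mathrm{obs}} = \frac{m_{\mathrm{PFS}}}{m_{\mathrm{OS}}}.$$

Thus, for fixed medians $m_{\mathrm{PFS}}$ and $m_{\mathrm{OS}}$,
CorrelatedPfsAndOs2() can only target

$$\frac{m_{\mathrm{PFS}}}{m_{\mathrm{OS}}} \le \tau_{\mathrm{obs}} < 1.$$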
If the requested Kendall’s tau is too small for the two medians, the function stops with an error.
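As a rough illustration of the numerical step, the sketch below inverts the relation between the Gumbel parameter and the observed Kendall’s tau, namely tau_obs = 1 - (1 - r^theta) / theta with r = median PFS / median OS, using `uniroot()` from base R. The function names `tau_observed()` and `solve_theta()` are hypothetical, and the sketch solves for theta directly rather than for the latent tau; it is not the package’s internal code.

```r
# Hypothetical helpers for illustration; not the package's internal code.
# Observed Kendall's tau between PFS and OS as a function of theta.
tau_observed <- function(theta, median_pfs, median_os) {
  r <- median_pfs / median_os
  1 - (1 - r^theta) / theta
}

solve_theta <- function(kendall, median_pfs, median_os) {
  lower_bound <- median_pfs / median_os  # observed tau at theta = 1
  if (kendall < lower_bound || kendall >= 1) {
    stop("kendall must lie in [median_pfs / median_os, 1)")
  }
  if (kendall == lower_bound) return(1)
  # The observed tau exceeds the latent tau 1 - 1/theta, so
  # theta = 1 / (1 - kendall) is a valid upper end of the search interval.
  uniroot(function(th) tau_observed(th, median_pfs, median_os) - kendall,
          lower = 1, upper = 1 / (1 - kendall), tol = 1e-10)$root
}

theta <- solve_theta(kendall = 0.6, median_pfs = 5, median_os = 11)
tau_latent <- 1 - 1 / theta
c(theta = theta, tau_latent = tau_latent)
```

With medians 5 and 11 the lower bound is 5/11, about 0.455, so a request such as `kendall = 0.3` would fail in this sketch, in line with the error behavior described above.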
Example
Suppose we want to simulate PFS with median 5 months and OS with median 11 months, targeting Kendall’s tau of 0.6 between observed, uncensored PFS and OS times.
```r
library(TrialSimulator)

pfs_and_os <- endpoint(name = c('pfs', 'os'),
                       type = c('tte', 'tte'),
                       generator = CorrelatedPfsAndOs2,
                       median_pfs = 5,
                       median_os = 11,
                       kendall = 0.6,
                       pfs_name = 'pfs',
                       os_name = 'os')
pfs_and_os
```

For verification only, we can call the generator directly. This is
not the recommended way to use TrialSimulator for
simulation studies; in practice, the generator should usually be
supplied to endpoint() and used inside the trial simulation
workflow. Direct calls are helpful here because they let us check the
marginal medians, the observed Kendall’s tau, and the ordering
$\mathrm{PFS} \le \mathrm{OS}$ before adding enrollment, censoring, treatment arms, and analyses.
```r
set.seed(123)
dat <- CorrelatedPfsAndOs2(n = 100000,
                           median_pfs = 5,
                           median_os = 11,
                           kendall = 0.6)
head(dat, 2)
#>        pfs        os pfs_event os_event
#> 1 6.828217 14.728535         1        1
#> 2 2.153555  2.617195         1        1
```

The simulation should approximately recover the requested medians and Kendall’s tau between observed, uncensored PFS and OS times.
```r
with(dat, median(pfs))
#> [1] 4.989699
with(dat, median(os))
#> [1] 11.0149
with(dat, cor(pfs, os, method = 'kendall'))
#> [1] 0.598662
with(dat, all(pfs <= os))
#> [1] TRUE
```

Because this generator returns event indicators equal to 1 for both endpoints, censoring and staggered enrollment should be handled by the broader trial simulation workflow rather than by this endpoint generator.
Practical Guidance
This generator is most useful when the simulation objective is to specify marginal PFS and OS medians and a rank-based association between the two observed endpoints. Kendall’s tau is often more stable and interpretable than Pearson correlation for skewed time-to-event variables, and it is directly connected to the Gumbel copula parameter.
There are still modeling assumptions to keep in mind:
- PFS and OS are both marginally exponential.
- Dependence is induced through a Gumbel survival copula on latent TTP and OS.
- The Gumbel copula only allows non-negative dependence.
- The requested Kendall’s tau is for observed PFS and OS event times before censoring is applied.
- The attainable Kendall’s tau between observed PFS and OS is constrained by the PFS/OS median ratio.
For simulations that require transition-specific hazards or a
mechanistic disease process, the illness-death model implemented in
TrialSimulator::CorrelatedPfsAndOs3() may be more
appropriate. For simulations where the main requirement is a direct
medians-plus-Kendall’s-tau specification, especially when OS will be
analyzed under a proportional hazards Cox model,
CorrelatedPfsAndOs2() provides a compact alternative.
Appendix: Kendall’s Tau for Observed PFS and OS
This appendix gives the derivation behind the formula used in
CorrelatedPfsAndOs2().
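Let $X$ and $Y$ be nonnegative random variables with exponential margins

$$P(X > x) = e^{-\lambda_X x}, \qquad P(Y > y) = e^{-\lambda_Y y},$$

and joint survival function

$$S(x, y) = P(X > x,\, Y > y) = \exp\left\{-\left[(\lambda_X x)^{\theta} + (\lambda_Y y)^{\theta}\right]^{1/\theta}\right\}, \qquad \theta \ge 1.$$

Define $Z = \min(X, Y)$. We want Kendall’s tau between $Z$ and $Y$.

Standardize the margins by setting

$$U = \lambda_X X, \qquad V = \lambda_Y Y.$$

Then the joint survival function of $(U, V)$ is

$$P(U > u,\, V > v) = \exp\left\{-\left(u^{\theta} + v^{\theta}\right)^{1/\theta}\right\}.$$

Use the polar-type transformation

$$R = \left(U^{\theta} + V^{\theta}\right)^{1/\theta}, \qquad W = \frac{U^{\theta}}{U^{\theta} + V^{\theta}}.$$

A Jacobian calculation gives the joint density

$$f_{R,W}(r, w) = \frac{1}{\theta}\,(r + \theta - 1)\, e^{-r}, \qquad r > 0,\ 0 < w < 1.$$

Thus, $R$ and $W$ are independent and $W \sim \mathrm{Uniform}(0, 1)$. The inverse transformation is

$$U = R\, W^{1/\theta}, \qquad V = R\, (1 - W)^{1/\theta}.$$

Let

$$p = \frac{\lambda_X^{\theta}}{\lambda_X^{\theta} + \lambda_Y^{\theta}}.$$

The event $X < Y$ is equivalent to $W < p$:

$$X < Y \iff \lambda_Y^{\theta} U^{\theta} < \lambda_X^{\theta} V^{\theta} \iff \lambda_Y^{\theta} W < \lambda_X^{\theta} (1 - W) \iff W < p,$$

so

$$P(X < Y) = p.$$

Using the survival-copula representation of Kendall’s tau,

$$\tau(Z, Y) = 4\, E\left[S_{Z,Y}(Z, Y)\right] - 1, \qquad S_{Z,Y}(z, y) = P(Z > z,\, Y > y).$$

Because $Z \le Y$, we have $S_{Z,Y}(z, y) = S(z, y)$ whenever $z \le y$, so it suffices to evaluate $S(Z, Y)$.

When $X < Y$, $Z = X$ and

$$S(Z, Y) = S(X, Y) = e^{-R}.$$

When $X \ge Y$, $Z = Y$. Since

$$S(y, y) = \exp\left\{-y \left(\lambda_X^{\theta} + \lambda_Y^{\theta}\right)^{1/\theta}\right\} = \exp\left\{-\lambda_Y\, y\, (1 - p)^{-1/\theta}\right\},$$

we have, using $\lambda_Y Y = V = R\,(1 - W)^{1/\theta}$,

$$S(Z, Y) = S(Y, Y) = \exp\left\{-R \left(\frac{1 - W}{1 - p}\right)^{1/\theta}\right\}.$$

Therefore,

$$S(Z, Y) = e^{-R}\, \mathbf{1}\{W < p\} + \exp\left\{-R \left(\frac{1 - W}{1 - p}\right)^{1/\theta}\right\} \mathbf{1}\{W \ge p\}.$$

By independence of $R$ and $W$,

$$E\left[S(Z, Y)\right] = p\, \varphi(1) + \int_p^1 \varphi\!\left(\left(\frac{1 - w}{1 - p}\right)^{1/\theta}\right) dw,$$

where $\varphi(s) = E\left[e^{-sR}\right]$. From the density of $R$,

$$\varphi(s) = \frac{1}{\theta}\left[\frac{1}{(1 + s)^2} + \frac{\theta - 1}{1 + s}\right] = \frac{\theta + (\theta - 1)\, s}{\theta\, (1 + s)^2}.$$

In particular,

$$\varphi(1) = \frac{2\theta - 1}{4\theta}.$$

For the integral term, use the substitution

$$s = \left(\frac{1 - w}{1 - p}\right)^{1/\theta}.$$

Then

$$1 - w = (1 - p)\, s^{\theta}, \qquad dw = -(1 - p)\, \theta\, s^{\theta - 1}\, ds, \qquad \int_p^1 \varphi\!\left(\left(\frac{1 - w}{1 - p}\right)^{1/\theta}\right) dw = (1 - p)\, \theta \int_0^1 s^{\theta - 1} \varphi(s)\, ds.$$

Substituting $\varphi$ gives

$$(1 - p) \int_0^1 \frac{s^{\theta - 1}\left[\theta + (\theta - 1)\, s\right]}{(1 + s)^2}\, ds.$$

Because

$$\frac{d}{ds}\left(\frac{s^{\theta}}{1 + s}\right) = \frac{s^{\theta - 1}\left[\theta + (\theta - 1)\, s\right]}{(1 + s)^2},$$

the integral is

$$\left[\frac{s^{\theta}}{1 + s}\right]_0^1 = \frac{1}{2}.$$

Thus,

$$E\left[S(Z, Y)\right] = p\, \frac{2\theta - 1}{4\theta} + \frac{1 - p}{2} = \frac{1}{2} - \frac{p}{4\theta},$$

using $\varphi(1) = (2\theta - 1)/(4\theta)$. Finally,

$$\tau(Z, Y) = 4\, E\left[S(Z, Y)\right] - 1 = 1 - \frac{p}{\theta}.$$

Since $Z$ has rate

$$\lambda_Z = \left(\lambda_X^{\theta} + \lambda_Y^{\theta}\right)^{1/\theta}, \qquad \text{so that } p = 1 - \left(\frac{\lambda_Y}{\lambda_Z}\right)^{\theta},$$

this can also be written as

$$\tau(Z, Y) = 1 - \frac{1}{\theta}\left[1 - \left(\frac{\lambda_Y}{\lambda_Z}\right)^{\theta}\right].$$

Because
$\lambda_Y / \lambda_Z = m_{\mathrm{PFS}} / m_{\mathrm{OS}}$,
the formula used by CorrelatedPfsAndOs2() is

$$\tau_{\mathrm{obs}} = 1 - \frac{1}{\theta}\left[1 - \left(\frac{m_{\mathrm{PFS}}}{m_{\mathrm{OS}}}\right)^{\theta}\right].$$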
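The derivation can be checked by simulation with base R alone. The sketch below draws from the model through the polar-type representation used in this appendix, under the assumption that the radial part has density (r + theta - 1) exp(-r) / theta, which is a mixture of a Gamma(2, 1) variable with probability 1/theta and an Exponential(1) variable otherwise. It then compares the empirical Kendall’s tau between min(X, Y) and Y with 1 - p / theta. The rates, `theta`, and sample size are arbitrary illustrative values, and this sampler is not claimed to be how CorrelatedPfsAndOs2() generates data.

```r
set.seed(1)
n <- 2000          # kept small because cor(method = "kendall") is O(n^2)
theta <- 2         # illustrative Gumbel parameter
lambda_x <- 0.12   # latent TTP rate (arbitrary illustrative value)
lambda_y <- 0.063  # OS rate (arbitrary illustrative value)

# Angular part: Uniform(0, 1), independent of the radial part
w <- runif(n)
# Radial part: Gamma(shape = 2) with probability 1/theta, else Exponential(1)
r <- ifelse(runif(n) < 1 / theta, rgamma(n, shape = 2, rate = 1), rexp(n))

# Invert the transformation and rescale to the original time scale
x <- r * w^(1 / theta) / lambda_x        # latent TTP
y <- r * (1 - w)^(1 / theta) / lambda_y  # OS
z <- pmin(x, y)                          # observed PFS

p <- lambda_x^theta / (lambda_x^theta + lambda_y^theta)
check <- c(tau_empirical = cor(z, y, method = "kendall"),
           tau_formula   = 1 - p / theta,
           prop_x_first  = mean(x < y),
           p             = p)
check
```

The empirical and theoretical values should agree up to Monte Carlo error; a larger `n` tightens the comparison at the cost of a slower Kendall’s tau computation.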