Skip to contents

Progression-free survival (PFS) and overall survival (OS) are common time-to-event endpoints in oncology trials. Because progression or death defines a PFS event, simulated trial data should always satisfy PFSOS\mbox{PFS} \leq \mbox{OS}. At the same time, the two endpoints are often highly correlated, so independent marginal simulation is usually not appropriate.

This vignette describes the copula-based generator implemented in TrialSimulator::CorrelatedPfsAndOs2(). The function is intended for simulation settings where the user wants to specify three interpretable quantities:

  • the marginal median PFS,
  • the marginal median OS,
  • Kendall’s tau between the observed, uncensored PFS and OS times.

The method uses a Gumbel survival copula for a latent time to progression (TTP) and OS, and then defines

PFS=min(TTP,OS). \mbox{PFS} = \min(\mbox{TTP}, \mbox{OS}).

This construction guarantees PFSOS\mbox{PFS} \leq \mbox{OS} for every simulated patient. It also keeps both marginal PFS and OS exponential, so the inputs can be specified directly through their medians.

This is a useful distinction from the illness-death model in TrialSimulator::CorrelatedPfsAndOs3(). The illness-death model gives a clinically interpretable transition structure, but OS generated from that model can have a time-varying hazard ratio between treatment arms. Therefore, if OS will be analyzed using a proportional hazards Cox model and the simulation should be consistent with that analysis model, the copula-based method described here is often more appropriate.

Model

Let XX denote latent TTP and let YY denote OS. The model assumes exponential marginal survival functions

Pr(X>x)=exp(λXx),Pr(Y>y)=exp(λYy), \Pr(X > x) = \exp(-\lambda_X x), \qquad \Pr(Y > y) = \exp(-\lambda_Y y),

and a Gumbel–Hougaard survival copula

Pr(X>x,Y>y)=exp[{(λXx)θ+(λYy)θ}1/θ],θ1. \Pr(X > x, Y > y) = \exp\left[ -\left\{(\lambda_X x)^\theta + (\lambda_Y y)^\theta\right\}^{1/\theta} \right], \qquad \theta \geq 1.

The observed PFS time is

P=min(X,Y). P = \min(X, Y).

Under this model, OS is exponential with rate λY\lambda_Y. PFS is also exponential, with survival function

Pr(P>t)=Pr(X>t,Y>t)=exp[{(λXθ+λYθ)tθ}1/θ]=exp(λPt), \Pr(P > t) = \Pr(X > t, Y > t) = \exp\left[ -\left\{(\lambda_X^\theta + \lambda_Y^\theta)t^\theta\right\}^{1/\theta} \right] = \exp(-\lambda_P t),

where

λP=(λXθ+λYθ)1/θ. \lambda_P = \left(\lambda_X^\theta + \lambda_Y^\theta\right)^{1/\theta}.

Therefore, if the target medians are mPm_P for PFS and mYm_Y for OS,

λP=log(2)mP,λY=log(2)mY. \lambda_P = \frac{\log(2)}{m_P}, \qquad \lambda_Y = \frac{\log(2)}{m_Y}.

For a candidate value of θ\theta, the latent TTP rate is then

λX=(λPθλYθ)1/θ. \lambda_X = \left(\lambda_P^\theta - \lambda_Y^\theta\right)^{1/\theta}.

This requires mP<mYm_P < m_Y, as expected for PFS and OS.

Choosing the Copula Parameter

For the Gumbel copula, Kendall’s tau between latent TTP and OS is

τ(X,Y)=11θ. \tau(X,Y) = 1 - \frac{1}{\theta}.

However, CorrelatedPfsAndOs2() asks for Kendall’s tau between the observed, uncensored endpoints P=min(X,Y)P=\min(X,Y) and YY, not between latent TTP and OS. These are not the same quantity. The formula below is derived in the appendix. Under the model above,

τ(P,Y)=11θ[1(mPmY)θ]. \tau(P,Y) = 1 - \frac{1}{\theta} \left[ 1 - \left(\frac{m_P}{m_Y}\right)^\theta \right].

Equivalently, writing τX=τ(X,Y)\tau_X = \tau(X,Y) and θ=1/(1τX)\theta = 1/(1-\tau_X),

τ(P,Y)=1(1τX)[1(mPmY)1/(1τX)]. \tau(P,Y) = 1 - (1-\tau_X) \left[ 1 - \left(\frac{m_P}{m_Y}\right)^{1/(1-\tau_X)} \right].

CorrelatedPfsAndOs2() solves this equation numerically for the latent Kendall’s tau τX\tau_X, converts it to θ\theta, and then simulates from the Gumbel copula.

For any finite θ\theta, τ(P,Y)\tau(P,Y) is larger than τ(X,Y)\tau(X,Y):

τ(P,Y)τ(X,Y)=1θ(mPmY)θ>0. \tau(P,Y)-\tau(X,Y) = \frac{1}{\theta} \left(\frac{m_P}{m_Y}\right)^\theta > 0.

Intuitively, even if latent TTP and OS have a weaker association, observed PFS contains deaths by construction, which increases the association between observed PFS and OS.

The requested value of τ(P,Y)\tau(P,Y) has a lower bound. When θ=1\theta=1, TTP and OS are independent, but PFS still contains OS deaths through the definition P=min(X,Y)P=\min(X,Y). In this limiting case,

τ(P,Y)=mPmY. \tau(P,Y) = \frac{m_P}{m_Y}.

Thus, for fixed medians mPm_P and mYm_Y, CorrelatedPfsAndOs2() can only target

mPmYτ(P,Y)<1. \frac{m_P}{m_Y} \leq \tau(P,Y) < 1.

If the requested Kendall’s tau is too small for the two medians, the function stops with an error.

Example

Suppose we want to simulate PFS with median 5 months and OS with median 11 months, targeting Kendall’s tau of 0.6 between observed, uncensored PFS and OS times.

pfs_and_os <- endpoint(name = c('pfs', 'os'),
                       type = c('tte', 'tte'),
                       generator = CorrelatedPfsAndOs2,
                       median_pfs = 5,
                       median_os = 11,
                       kendall = 0.6,
                       pfs_name = 'pfs',
                       os_name = 'os')

pfs_and_os

For verification only, we can call the generator directly. This is not the recommended way to use TrialSimulator for simulation studies; in practice, the generator should usually be supplied to endpoint() and used inside the trial simulation workflow. Direct calls are helpful here because they let us check the marginal medians, the observed Kendall’s tau, and the ordering PFSOS\mbox{PFS} \leq \mbox{OS} before adding enrollment, censoring, treatment arms, and analyses.

set.seed(123)
dat <- CorrelatedPfsAndOs2(n = 100000,
                           median_pfs = 5,
                           median_os = 11,
                           kendall = 0.6)

head(dat, 2)
#>        pfs        os pfs_event os_event
#> 1 6.828217 14.728535         1        1
#> 2 2.153555  2.617195         1        1

The simulation should approximately recover the requested medians and Kendall’s tau between observed, uncensored PFS and OS times.

with(dat, median(pfs))
#> [1] 4.989699
with(dat, median(os))
#> [1] 11.0149
with(dat, cor(pfs, os, method = 'kendall'))
#> [1] 0.598662
with(dat, all(pfs <= os))
#> [1] TRUE

Because this generator returns event indicators equal to 1 for both endpoints, censoring and staggered enrollment should be handled by the broader trial simulation workflow rather than by this endpoint generator.

Practical Guidance

This generator is most useful when the simulation objective is to specify marginal PFS and OS medians and a rank-based association between the two observed endpoints. Kendall’s tau is often more stable and interpretable than Pearson correlation for skewed time-to-event variables, and it is directly connected to the Gumbel copula parameter.

There are still modeling assumptions to keep in mind:

  • PFS and OS are both marginally exponential.
  • Dependence is induced through a Gumbel survival copula on latent TTP and OS.
  • The Gumbel copula only allows non-negative dependence.
  • The requested Kendall’s tau is for observed PFS and OS event times before censoring is applied.
  • The attainable Kendall’s tau between observed PFS and OS is constrained by the PFS/OS median ratio.

For simulations that require transition-specific hazards or a mechanistic disease process, the illness-death model implemented in TrialSimulator::CorrelatedPfsAndOs3() may be more appropriate. For simulations where the main requirement is a direct medians-plus-Kendall’s-tau specification, especially when OS will be analyzed under a proportional hazards Cox model, CorrelatedPfsAndOs2() provides a compact alternative.

Appendix: Kendall’s Tau for Observed PFS and OS

This appendix gives the derivation behind the formula used in CorrelatedPfsAndOs2().

Let XX and YY be nonnegative random variables with exponential margins

XExp(λX),YExp(λY), X \sim \operatorname{Exp}(\lambda_X), \qquad Y \sim \operatorname{Exp}(\lambda_Y),

and joint survival function

SX,Y(x,y)=Pr(X>x,Y>y)=exp[{(λXx)θ+(λYy)θ}1/θ],θ1. S_{X,Y}(x,y) = \Pr(X>x, Y>y) = \exp\left[ -\left\{(\lambda_X x)^\theta + (\lambda_Y y)^\theta\right\}^{1/\theta} \right], \qquad \theta \ge 1.

Define P=min(X,Y)P=\min(X,Y). We want Kendall’s tau between PP and YY.

Standardize the margins by setting

A=λXX,B=λYY. A=\lambda_X X, \qquad B=\lambda_Y Y.

Then the joint survival function of (A,B)(A,B) is

Pr(A>a,B>b)=exp{(aθ+bθ)1/θ}. \Pr(A>a, B>b) = \exp\left\{-(a^\theta+b^\theta)^{1/\theta}\right\}.

Use the polar-type transformation

R=(Aθ+Bθ)1/θ,W=AθAθ+Bθ. R=(A^\theta+B^\theta)^{1/\theta}, \qquad W=\frac{A^\theta}{A^\theta+B^\theta}.

A Jacobian calculation gives the joint density

fR,W(r,w)=1θer(r+θ1),r>0,0<w<1. f_{R,W}(r,w)=\frac{1}{\theta} e^{-r}(r+\theta-1), \qquad r>0,\; 0<w<1.

Thus, RR and WW are independent and WUnif(0,1)W \sim \operatorname{Unif}(0,1). The inverse transformation is

A=RW1/θ,B=R(1W)1/θ. A = R W^{1/\theta}, \qquad B = R(1-W)^{1/\theta}.

Let

a=λXθ,b=λYθ,c=aa+b,d=ba+b. a=\lambda_X^\theta, \qquad b=\lambda_Y^\theta, \qquad c=\frac{a}{a+b}, \qquad d=\frac{b}{a+b}.

The event XYX \le Y is equivalent to WcW \le c:

XYAλXBλYWc, X \le Y \iff \frac{A}{\lambda_X} \le \frac{B}{\lambda_Y} \iff W \le c,

so

P={X,Wc,Y,W>c. P= \begin{cases} X, & W \le c,\\ Y, & W > c. \end{cases}

Using the survival-copula representation of Kendall’s tau,

τ(P,Y)=4E{SP,Y(P,Y)}1. \tau(P,Y)=4E\{S_{P,Y}(P,Y)\}-1.

When WcW \le c, P=XP=X and

SP,Y(P,Y)=SX,Y(X,Y)=eR. S_{P,Y}(P,Y)=S_{X,Y}(X,Y)=e^{-R}.

When W>cW>c, P=YP=Y. Since

SP,Y(y,y)=Pr(P>y,Y>y)=Pr(X>y,Y>y), S_{P,Y}(y,y)=\Pr(P>y,Y>y)=\Pr(X>y,Y>y),

we have, using Y=B/λY=R(1W)1/θ/λYY=B/\lambda_Y=R(1-W)^{1/\theta}/\lambda_Y,

SP,Y(Y,Y)=exp[R(1Wd)1/θ]. S_{P,Y}(Y,Y) = \exp\left[ -R\left(\frac{1-W}{d}\right)^{1/\theta} \right].

Therefore,

E{SP,Y(P,Y)}=E{eR𝟏(Wc)}+E[eR((1W)/d)1/θ𝟏(W>c)]. E\{S_{P,Y}(P,Y)\} = E\{e^{-R}\mathbf 1(W\le c)\} + E\left[ e^{-R((1-W)/d)^{1/\theta}} \mathbf 1(W>c) \right].

By independence of RR and WW,

E{SP,Y(P,Y)}=cLR(1)+c1LR{(1wd)1/θ}dw, E\{S_{P,Y}(P,Y)\} = cL_R(1) + \int_c^1 L_R\left\{\left(\frac{1-w}{d}\right)^{1/\theta}\right\} dw,

where LR(s)=E(esR)L_R(s)=E(e^{-sR}). From the density of RR,

LR(s)=1θ0e(1+s)r(r+θ1)dr=θ+(θ1)sθ(1+s)2. L_R(s) = \frac{1}{\theta} \int_0^\infty e^{-(1+s)r}(r+\theta-1)\,dr = \frac{\theta+(\theta-1)s}{\theta(1+s)^2}.

In particular,

LR(1)=2θ14θ. L_R(1)=\frac{2\theta-1}{4\theta}.

For the integral term, use the substitution

t=(1wd)1/θ,dw=dθtθ1dt. t=\left(\frac{1-w}{d}\right)^{1/\theta}, \qquad dw=-d\theta t^{\theta-1}dt.

Then

c1LR{(1wd)1/θ}dw=dθ01tθ1LR(t)dt. \int_c^1 L_R\left\{\left(\frac{1-w}{d}\right)^{1/\theta}\right\} dw = d\theta\int_0^1 t^{\theta-1}L_R(t)\,dt.

Substituting LR(t)L_R(t) gives

d01θtθ1+(θ1)tθ(1+t)2dt. d\int_0^1 \frac{\theta t^{\theta-1}+(\theta-1)t^\theta}{(1+t)^2} dt.

Because

ddt(tθ1+t)=θtθ1+(θ1)tθ(1+t)2, \frac{d}{dt}\left(\frac{t^\theta}{1+t}\right) = \frac{\theta t^{\theta-1}+(\theta-1)t^\theta}{(1+t)^2},

the integral is

d[tθ1+t]01=d2. d\left[\frac{t^\theta}{1+t}\right]_0^1 = \frac{d}{2}.

Thus,

E{SP,Y(P,Y)}=c2θ14θ+d2=2θ1+d4θ, E\{S_{P,Y}(P,Y)\} = c\frac{2\theta-1}{4\theta}+\frac{d}{2} = \frac{2\theta-1+d}{4\theta},

using c+d=1c+d=1. Finally,

τ(P,Y)=4E{SP,Y(P,Y)}1=θ1+dθ=11θ(λXθλXθ+λYθ). \tau(P,Y) = 4E\{S_{P,Y}(P,Y)\}-1 = \frac{\theta-1+d}{\theta} = 1-\frac{1}{\theta} \left( \frac{\lambda_X^\theta}{\lambda_X^\theta+\lambda_Y^\theta} \right).

Since PP has rate

λP=(λXθ+λYθ)1/θ, \lambda_P = \left(\lambda_X^\theta+\lambda_Y^\theta\right)^{1/\theta},

this can also be written as

τ(P,Y)=11θ(λXθλPθ)=11θ[1(λYλP)θ]. \tau(P,Y) = 1-\frac{1}{\theta} \left( \frac{\lambda_X^\theta}{\lambda_P^\theta} \right) = 1-\frac{1}{\theta} \left[ 1-\left(\frac{\lambda_Y}{\lambda_P}\right)^\theta \right].

Because λY/λP=mP/mY\lambda_Y/\lambda_P = m_P/m_Y, the formula used by CorrelatedPfsAndOs2() is

τ(P,Y)=11θ[1(mPmY)θ]. \tau(P,Y) = 1-\frac{1}{\theta} \left[ 1-\left(\frac{m_P}{m_Y}\right)^\theta \right].