
Define Longitudinal Endpoints in Clinical Trials
Source:vignettes/defineLongitudinalEndpoints.Rmd
defineLongitudinalEndpoints.Rmd
The · package provides a flexible framework for defining endpoints in clinical trial simulations. Users can choose to generate endpoints using either built-in random number generators from standard probability distributions or custom user-defined functions. The ability to specify custom generators is particularly useful when simulating correlated endpoints, which often arise in longitudinal settings.
Longitudinal endpoints refer to outcomes measured repeatedly over time for the same subject, such as blood pressure, biomarker levels, or symptom severity scores. These repeated measures typically exhibit within-subject correlation as well as correlation with baseline values. Accurately simulating such correlation structures is essential for realistic trial design and evaluation.
In this vignette, we demonstrate how to define longitudinal endpoints
using the endpoint()
function, focusing on its
generator
argument to specify the data-generating
mechanism, and the readout
argument to determine when each
endpoint is observed.
Example: Simulating Longitudinal Blood Pressure Endpoints
Suppose blood pressure is measured for each patient at baseline (week
0), and again at weeks 2 and 4 following randomization. We are
interested in changes from baseline at weeks 2 and 4. A typical dataset
at a milestone (e.g., after enrollment or follow-up) that can be
accessed by calling trial$get_locked_data(milestone_name)
may resemble the following:
#> # A tibble: 6 × 3
#> baseline bp_cfb2 bp_cfb4
#> <dbl> <dbl> <dbl>
#> 1 144. -24.7 -21.0
#> 2 148. -22.2 -42.5
#> 3 145. -13.4 -30.4
#> 4 148. -8.03 -31.2
#> 5 139. -28.1 -24.4
#> 6 133. -18.8 -18.3
In this example, the variables baseline
,
bp_cfb2
, and bp_cfb4
are observed at weeks 0,
2, and 4, respectively. Therefore, we will define
readout = c(baseline = 0, bp_cfb2 = 2, bp_cfb4 = 4)
in the
endpoint definition.
To simulate correlated values across time points, we assume that
baseline
, bp2
, and bp4
follow a
multivariate normal distribution. The user must provide the mean vector
and covariance matrix for these variables. Importantly, any custom
generator function must accept n
(number of observations)
as its first argument, as required by TrialSimulator, otherwise it
throws an error.
library(mvtnorm)
bp_generator <- function(n, bp_means, bp_vcov){
dat <- rmvnorm(n, mean = bp_means, sigma = bp_vcov) %>%
as.data.frame()
names(dat) <- c('baseline', 'bp2', 'bp4')
dat %>%
mutate(
bp_cfb2 = bp2 - baseline,
bp_cfb4 = bp4 - baseline
) %>%
select(baseline, bp_cfb2, bp_cfb4)
}
This function assumes a trivariate normal distribution for blood pressure at baseline, week 2, and week 4. Users may also extend this function to support more complex simulation logic as needed.
Defining Endpoints for an Arm
We now use the endpoint()
function to define the
longitudinal endpoints in a treatment arm. The required parameters for
the custom generator, bp_means
and bp_vcov
,
are passed through the ellipsis (...
) argument.
vcov1 <- matrix(
c(2, 1.5, 1,
1.5, 3, 1.5,
1, 1.5, 4),
nrow = 3
)
ep_in_trt1 <- endpoint(
name = c('bp_cfb2', 'baseline', 'bp_cfb4'),
type = rep('non-tte', 3),
readout = c(baseline = 0, bp_cfb4 = 4, bp_cfb2 = 2),
generator = bp_generator,
bp_means = c(140, 125, 120),
bp_vcov = vcov1
)
Note that we deliberately specify the name
and
readout
arguments in an order different from the data frame
returned by bp_generator
, to illustrate that the
endpoint()
function correctly maps variable names to their
respective time points. However, the user is responsible for ensuring
the order of name
and type
are aligned; the
function cannot automatically infer types from the generator output.
This example also demonstrates that endpoint()
can be
used to generate covariates. For instance, the baseline value is
typically used as a covariate in statistical models, and should
therefore be simulated and retained. More generally, users can define
covariates, biomarker dynamics, or subgroup indicators using the same
mechanism to simulate complex trial designs.
A simple way to generate a report for an endpoint object is to print it in console directly
ep_in_trt1
⚕⚕ Endpoint Name: bp_cfb2, baseline, bp_cfb4
⚕⚕ # of Endpoints: 3
Defining Endpoints for Another Arm
We now define another arm with slightly different means and correlation structure for blood pressure:
vcov2 <- matrix(
c(2, 1.5, 1,
1.5, 3, 1,
1, 1, 4),
nrow = 3
)
ep_in_trt2 <- endpoint(
name = c('bp_cfb2', 'baseline', 'bp_cfb4'),
type = rep('non-tte', 3),
readout = c(baseline = 0, bp_cfb4 = 4, bp_cfb2 = 2),
generator = bp_generator,
bp_means = c(140, 127, 122),
bp_vcov = vcov2
)
ep_in_trt2
⚕⚕ Endpoint Name: bp_cfb2, baseline, bp_cfb4
⚕⚕ # of Endpoints: 3
Note that user can even specify different generators across arms. For
example, bp_generator
can be replaced by another function
with different mechanism and arguments.
Creating Treatment Arms and Adding Endpoints
Finally, we create two treatment arms and attach the respective endpoints:
trt1 <- arm(name = 'treatment 1')
trt2 <- arm(name = 'treatment 2')
trt1$add_endpoints(ep_in_trt1)
trt2$add_endpoints(ep_in_trt2)
trt1
⚕⚕ Arm Name: treatment 1
⚕⚕ # of Endpoints: 3
⚕⚕ Registered Endpoints: bp_cfb2, baseline, bp_cfb4
trt2
⚕⚕ Arm Name: treatment 2
⚕⚕ # of Endpoints: 3
⚕⚕ Registered Endpoints: bp_cfb2, baseline, bp_cfb4