Wrapper Functions of Common Statistical Methods in TrialSimulator

The TrialSimulator package provides a unified set of wrapper functions that encapsulate statistical methods commonly used in clinical trial simulations. These functions facilitate model fitting, treatment comparisons, and covariate adjustments within a standardized interface.

When multiple active treatment arms are present, each wrapper function automatically performs pairwise comparisons between each active arm and the designated reference (e.g., placebo, control, or standard-of-care). All wrapper functions share a consistent syntax and output structure. Most of them support model specification via an R formula interface, and covariate adjustment is available where appropriate.

Under the hood, pairwise average treatment effects (ATEs) are estimated using the emmeans package. All tests are one-sided, with the direction set by the alternative argument. The ellipsis (...) argument can be used to define data subsets, enabling flexible analyses such as those needed in enrichment designs.
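To make the subsetting idea concrete, here is a minimal base R sketch of how conditions passed through ... could be applied to the data before a model is fitted. The helper filter_rows and the toy data frame are hypothetical and not part of TrialSimulator. The helper only imitates two documented behaviors of dplyr::filter(): comma-separated conditions are combined with AND, and rows where a condition evaluates to NA are dropped.

```r
## Hypothetical helper (not part of TrialSimulator): apply the conditions
## supplied through `...` to a data frame, imitating dplyr::filter().
filter_rows <- function(data, ...) {
  conds <- eval(substitute(alist(...)))   # capture conditions unevaluated
  keep <- rep(TRUE, nrow(data))
  for (cond in conds) {
    val <- eval(cond, data, parent.frame())
    keep <- keep & !is.na(val) & val      # NA counts as "do not keep"
  }
  data[keep, , drop = FALSE]
}

## toy data, made up for illustration
toy <- data.frame(arm = c('pbo', 'low', 'high', 'pbo'),
                  x = c(-0.5, 2.5, NA, 3.2),
                  biomarker = c(1, 0, 1, 1))

## keep patients with x in (-2, 3) who are biomarker-positive
sub <- filter_rows(toy, x > -2 & x < 3, biomarker == 1)
```

In this toy example only the first patient satisfies both conditions. Calling filter_rows(toy) with no conditions returns the data unchanged.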

Below is a summary of the available wrapper functions included in this vignette, along with their corresponding statistical methods, output metrics, and support for covariate adjustment.

Function	Method	Statistics in Outputs	Covariate adjustment
`fitLinear`	Linear model	ATE	Yes
`fitLogistic`	Logistic model	regression coefficient, log odds ratio, odds ratio, risk ratio, risk difference	Yes
`fitCoxph`	Cox PH model	log hazard ratio, hazard ratio	Yes
`fitLogrank`	logrank test		No, but supports strata
`fitFarringtonManning`	Farrington-Manning test		No

Example

To demonstrate the usage of the wrapper functions, we simulate a hypothetical three-arm trial with one control (pbo) and two active doses (low and high). The trial includes a continuous covariate x, three endpoint types (time-to-event, continuous, and binary), and a binary biomarker used to define subgroups.

The placebo arm is constructed as follows:

## time-to-event endpoint
pfs <- endpoint(name = 'pfs', type = 'tte', generator = rexp, rate = .07)
## continuous endpoint
cep <- endpoint(name = 'cep', type = 'non-tte', 
                readout = c(cep = 0), generator = rnorm)
## binary endpoint
bep <- endpoint(name = 'bep', type = 'non-tte', 
                readout = c(bep = 0), generator = rbinom, size = 1, prob = .1)

## biomarker
bm <- endpoint(name = 'biomarker', type = 'non-tte', 
               readout = c(biomarker = 0), generator = rbinom, 
               size = 1, prob = .7)

## covariate
covar <- endpoint(name = 'x', type = 'non-tte', 
                  readout = c(x = 0), generator = rnorm)

pbo <- arm(name = 'pbo')
pbo$add_endpoints(pfs, cep, bep, bm, covar)

For brevity, the code for the low and high dose arms and for the trial definition is hidden in this vignette. Refer to the source of this vignette for the full code. A single milestone for the final analysis is defined, with an empty action doNothing, because we will explicitly request locked data outside of the action function when demonstrating the wrapper functions.

#> A milestone <final> is registered.

Now we execute the trial. After simulation, locked data can be retrieved using the get_locked_data() method with the milestone name "final".

controller$run(n = 1, plot_event = FALSE, silent = TRUE)
locked_data <- trial$get_locked_data('final')
head(locked_data)
#>   patient_id  arm enroll_time dropout_time       pfs pfs_event         cep
#> 1          1  pbo  0.00000000   322.817715  6.819525         1 -0.21028508
#> 2          2 high  0.03333333   408.104162 98.520770         1 -0.08806648
#> 3          3  low  0.06666667    45.651176 21.215875         1  2.12517509
#> 4          4  low  0.10000000   103.734510 27.046261         1  1.53260487
#> 5          5  pbo  0.13333333     8.685884  4.776078         1  0.71222847
#> 6          6 high  0.16666667    12.309588 12.309588         0  1.80510702
#>   cep_readout bep bep_readout biomarker biomarker_readout             x
#> 1           0   0           0         0                 0 -0.6829126592
#> 2           0   0           0         1                 0  2.2913063006
#> 3           0   1           0         1                 0  0.0001582446
#> 4           0   0           0         0                 0 -0.1193032317
#> 5           0   0           0         1                 0  1.5541601999
#> 6           0   1           0         1                 0  0.1176311038
#>   x_readout
#> 1         0
#> 2         0
#> 3         0
#> 4         0
#> 5         0
#> 6         0

table(locked_data$arm)
#> 
#> high  low  pbo 
#>  100  100  100

Analyze Time-to-Event Endpoint

We begin by analyzing the time-to-event endpoint pfs using both a Cox proportional hazards model and a log-rank test. When performing analysis on a subset defined via the ... argument, the syntax must be compatible with that of dplyr::filter().

## adjust for covariate x
fitCoxph(Surv(pfs, pfs_event) ~ arm + x, placebo = 'pbo', 
         data = locked_data, alternative = 'less', 
         scale = 'hazard ratio')
#>    arm placebo  estimate            p info         z
#> 1 high     pbo 0.4212846 7.451797e-08  171 -5.253747
#> 2  low     pbo 0.7270550 1.826582e-02  178 -2.090960

fitLogrank(Surv(pfs, pfs_event) ~ arm, placebo = 'pbo', 
           data = locked_data, alternative = 'less')
#>    arm placebo            p info         z
#> 1 high     pbo 4.327270e-08  171 -5.352922
#> 2  low     pbo 1.945612e-02  178 -2.065114

## more details
fitLogrank(Surv(pfs, pfs_event) ~ arm, placebo = 'pbo', 
           data = locked_data, alternative = 'less', tidy = FALSE)
#>    arm placebo            p info         z info_pbo info_trt n_pbo n_trt
#> 1 high     pbo 4.327270e-08  171 -5.352922       88       83   100   100
#> 2  low     pbo 1.945612e-02  178 -2.065114       88       90   100   100

## with strata
fitLogrank(Surv(pfs, pfs_event) ~ arm + strata(biomarker), placebo = 'pbo', 
           data = locked_data, alternative = 'less')
#>    arm placebo            p info         z
#> 1 high     pbo 1.118371e-07  171 -5.178502
#> 2  low     pbo 1.618737e-02  178 -2.139753

## analyze a subset
fitCoxph(Surv(pfs, pfs_event) ~ arm + strata(biomarker), placebo = 'pbo', 
         data = locked_data, alternative = 'less', 
         scale = 'log hazard ratio', 
         x > -2 & x < 3) ## define a subset
#>    arm placebo   estimate            p info         z
#> 1 high     pbo -0.8573491 1.459637e-07  167 -5.128585
#> 2  low     pbo -0.3321455 1.494187e-02  178 -2.171628
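For readers who want to see what such a pairwise comparison might look like without the wrappers, the following sketch uses the survival package directly on made-up data. It is an illustration under stated assumptions, not the implementation of fitCoxph or fitLogrank. The event rates, the censoring process, the helper compare_to_pbo, and the choice to fit each active arm against placebo on a two-arm subset are all assumptions. The info column is taken to be the number of events in the two arms being compared, which is what the tidy = FALSE output above suggests.

```r
library(survival)

set.seed(1)
n <- 100                                   # patients per arm
arm <- rep(c('pbo', 'low', 'high'), each = n)
rate <- c(pbo = .07, low = .05, high = .03)[arm]   # assumed event rates
t_event <- rexp(3 * n, rate = rate)
t_cens <- rexp(3 * n, rate = .01)          # assumed dropout process
toy <- data.frame(arm = arm, x = rnorm(3 * n),
                  pfs = pmin(t_event, t_cens),
                  pfs_event = as.integer(t_event <= t_cens))

## hypothetical helper: compare one active arm with placebo
compare_to_pbo <- function(active) {
  sub <- toy[toy$arm %in% c('pbo', active), ]
  sub$arm <- factor(sub$arm, levels = c('pbo', active))

  ## Cox model adjusted for x; z is the Wald statistic of the arm effect
  cox <- coxph(Surv(pfs, pfs_event) ~ arm + x, data = sub)
  z_cox <- summary(cox)$coefficients[paste0('arm', active), 'z']

  ## logrank test; signed z from observed minus expected events (active arm)
  lr <- survdiff(Surv(pfs, pfs_event) ~ arm, data = sub)
  z_lr <- unname((lr$obs[2] - lr$exp[2]) / sqrt(lr$var[2, 2]))

  data.frame(arm = active, placebo = 'pbo',
             hazard_ratio = exp(coef(cox)[[paste0('arm', active)]]),
             p_cox = pnorm(z_cox),         # one-sided, alternative 'less'
             p_logrank = pnorm(z_lr),
             info = sum(sub$pfs_event))
}

res <- do.call(rbind, lapply(c('high', 'low'), compare_to_pbo))
```

With alternative = 'less', a one-sided p-value is the lower-tail normal probability of z. The Cox output above is consistent with this: for the low dose, pnorm(-2.090960) agrees with the reported p of 1.826582e-02 to the digits shown.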

Analyze Continuous Endpoint

We analyze the continuous endpoint cep using linear models, with and without covariate adjustment.

## ATE accounting for covariate x
fitLinear(cep ~ arm * x, placebo = 'pbo', 
          data = locked_data, alternative = 'greater')
#> NOTE: Results may be misleading due to involvement in interactions
#> NOTE: Results may be misleading due to involvement in interactions
#>    arm placebo estimate            p info        z
#> 1 high     pbo 1.145315 1.487699e-14  200 8.206717
#> 2  low     pbo 1.260325 0.000000e+00  200 9.385532

## marginal model
fitLinear(cep ~ arm, placebo = 'pbo', 
          data = locked_data, alternative = 'greater')
#>    arm placebo estimate            p info        z
#> 1 high     pbo 1.141955 1.287859e-14  200 8.223983
#> 2  low     pbo 1.246252 0.000000e+00  200 9.329409

## analyze a sub-group
fitLinear(cep ~ arm, placebo = 'pbo', 
          data = locked_data, alternative = 'greater', 
          biomarker == 1) ## define the subgroup
#>    arm placebo estimate            p info        z
#> 1 high     pbo 1.233081 1.240319e-11  142 7.257133
#> 2  low     pbo 1.253416 1.585532e-11  133 7.253694
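The NOTE printed for the model cep ~ arm * x arises because arm is involved in an interaction, so a single coefficient no longer summarizes the treatment effect. One common way to obtain an average treatment effect in that situation is standardization: predict every patient's outcome under each arm and average the differences. The base R sketch below illustrates the idea on made-up data. It is an assumption about how an ATE can be defined here, not a description of what fitLinear or emmeans computes internally, and the effect sizes and the helper mean_pred are hypothetical.

```r
set.seed(2)
n <- 300
arm <- factor(rep(c('pbo', 'low', 'high'), each = n / 3),
              levels = c('pbo', 'low', 'high'))
x <- rnorm(n)
## assumed data-generating model with an arm-by-covariate interaction
cep <- 1.0 * (arm != 'pbo') + 0.5 * x + 0.3 * x * (arm == 'high') + rnorm(n)
toy <- data.frame(arm = arm, x = x, cep = cep)

fit <- lm(cep ~ arm * x, data = toy)

## hypothetical helper: mean predicted outcome if everyone received `level`
mean_pred <- function(level) {
  newdat <- toy
  newdat$arm <- factor(level, levels = levels(toy$arm))
  mean(predict(fit, newdata = newdat))
}

ate <- c(high = mean_pred('high') - mean_pred('pbo'),
         low = mean_pred('low') - mean_pred('pbo'))
```

Because the model is linear in x, the standardized ATE for the high dose equals the arm coefficient plus the interaction coefficient times the sample mean of x. That is why evaluating the contrast at the average covariate value gives the same answer in this setting.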

Analyze Binary Endpoint

We analyze the binary endpoint bep using logistic regression. Multiple estimands (e.g., odds ratio, risk ratio, risk difference) can be computed by specifying the scale argument.

## compute regression coefficient of arm
fitLogistic(bep ~ arm * x + biomarker, placebo = 'pbo', 
            data = locked_data, alternative = 'greater', 
            scale = 'coefficient')
#>    arm placebo  estimate            p info         z
#> 1 high     pbo 1.5617130 2.607897e-05  200 4.0457407
#> 2  low     pbo 0.4052616 1.706828e-01  200 0.9514706

## compute odds ratio (ATE)
fitLogistic(bep ~ arm + x*biomarker, placebo = 'pbo', 
            data = locked_data, alternative = 'greater', 
            scale = 'odds ratio')
#>    arm placebo estimate            p info         z
#> 1 high     pbo 4.519236 2.494409e-05  200 4.0561502
#> 2  low     pbo 1.434263 1.924932e-01  200 0.8687454

## compute risk ratio (ATE)
fitLogistic(bep ~ arm + x + biomarker, placebo = 'pbo', 
            data = locked_data, alternative = 'greater', 
            scale = 'risk ratio')
#>    arm placebo estimate            p info         z
#> 1 high     pbo 3.208045 5.888748e-05  200 3.8507119
#> 2  low     pbo 1.371446 1.894037e-01  200 0.8800958
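The different scales can be related to one another through predicted risks. The base R sketch below fits a covariate-adjusted logistic model on made-up data, standardizes the predicted risk over the observed covariates for each arm, and then forms the odds ratio, risk ratio, and risk difference from those marginal risks. This is one plausible reading of the ATE labels in the comments above, offered as an assumption rather than as the implementation of fitLogistic. The data-generating values and the helpers risk_under and summarise_arm are hypothetical.

```r
set.seed(3)
n <- 300
arm <- factor(rep(c('pbo', 'low', 'high'), each = n / 3),
              levels = c('pbo', 'low', 'high'))
x <- rnorm(n)
## assumed data-generating model on the logit scale
lin <- -2 + 0.4 * (arm == 'low') + 1.5 * (arm == 'high') + 0.5 * x
toy <- data.frame(arm = arm, x = x, bep = rbinom(n, 1, plogis(lin)))

fit <- glm(bep ~ arm + x, family = binomial(), data = toy)

## hypothetical helper: average predicted risk if everyone received `level`
risk_under <- function(level) {
  newdat <- toy
  newdat$arm <- factor(level, levels = levels(toy$arm))
  mean(predict(fit, newdata = newdat, type = 'response'))
}

## hypothetical helper: effect of one active arm on several scales
summarise_arm <- function(active) {
  p1 <- risk_under(active)
  p0 <- risk_under('pbo')
  data.frame(arm = active, placebo = 'pbo',
             coefficient = coef(fit)[[paste0('arm', active)]],
             odds_ratio = (p1 / (1 - p1)) / (p0 / (1 - p0)),
             risk_ratio = p1 / p0,
             risk_difference = p1 - p0)
}

res <- do.call(rbind, lapply(c('high', 'low'), summarise_arm))
```

Note that the regression coefficient is a conditional log odds ratio, so exponentiating it generally does not reproduce the marginal odds ratio once covariates are in the model.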

The risk difference can also be estimated using logistic regression or the Farrington-Manning test. Note that the latter does not support covariate adjustment.

## compute risk difference (ATE)
fitLogistic(bep ~ arm + x * biomarker, placebo = 'pbo', 
            data = locked_data, alternative = 'greater', 
            scale = 'risk difference')
#>    arm placebo   estimate            p info         z
#> 1 high     pbo 0.25622322 7.567886e-06  200 4.3267032
#> 2  low     pbo 0.04246899 1.909160e-01  200 0.8745257

## compute risk difference without covariate
fitLogistic(bep ~ arm, placebo = 'pbo', 
            data = locked_data, alternative = 'greater', 
            scale = 'risk difference')
#>    arm placebo estimate            p info         z
#> 1 high     pbo     0.26 4.271057e-06  200 4.4511262
#> 2  low     pbo     0.04 2.071073e-01  200 0.8164994

## analyze a sub-group
fitLogistic(bep ~ arm, placebo = 'pbo', 
            data = locked_data, alternative = 'greater', 
            scale = 'risk difference', 
            x < 2 & biomarker != 1) ## define a subgroup
#>    arm placebo   estimate           p info         z
#> 1 high     pbo 0.27359617 0.005291046   58 2.5562045
#> 2  low     pbo 0.07465438 0.184003001   66 0.9002147

## analyze the same sub-group using the FM test,
## same estimate but different p-values
fitFarringtonManning(endpoint = 'bep', placebo = 'pbo', 
                     data = locked_data, alternative = 'greater', 
                     x < 2 & biomarker != 1)
#>    arm placebo   estimate           p info         z
#> 1 high     pbo 0.27359618 0.006345068   58 2.4923488
#> 2  low     pbo 0.07465438 0.188880249   66 0.8820302
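The following base R sketch suggests why the two approaches can share an estimate yet give different p-values. It computes the risk difference with two standard errors: an unrestricted one based on the observed arm-level proportions, and one restricted to the null hypothesis, which for a zero margin is the pooled proportion. Everything here is an assumption for illustration. The helper rd_tests is hypothetical, a null risk difference of zero is assumed, and the event counts are back-calculated from the estimates and info values printed above rather than taken from the data.

```r
## Hypothetical helper: risk difference with unrestricted and null-restricted
## standard errors (null risk difference assumed to be zero).
rd_tests <- function(events_trt, n_trt, events_pbo, n_pbo) {
  p1 <- events_trt / n_trt
  p0 <- events_pbo / n_pbo
  rd <- p1 - p0
  ## unrestricted: variance evaluated at the observed proportions
  se_wald <- sqrt(p1 * (1 - p1) / n_trt + p0 * (1 - p0) / n_pbo)
  ## restricted: variance evaluated at the pooled proportion under the null
  p_pool <- (events_trt + events_pbo) / (n_trt + n_pbo)
  se_null <- sqrt(p_pool * (1 - p_pool) * (1 / n_trt + 1 / n_pbo))
  z <- c(unrestricted = rd / se_wald, restricted = rd / se_null)
  list(estimate = rd, z = z,
       p = pnorm(z, lower.tail = FALSE))   # one-sided, alternative 'greater'
}

## Assumed counts (back-calculated, not from the source data):
## 3/31 responders on placebo, 10/27 on high, 6/35 on low
high <- rd_tests(10, 27, 3, 31)
low <- rd_tests(6, 35, 3, 31)
```

With these assumed counts, the unrestricted z values agree with the fitLogistic subgroup output and the restricted z values agree with the fitFarringtonManning output to about three decimals. This is consistent with, but does not confirm, the difference lying in how the variance is evaluated.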