Adaptive trials: A study reproducibility simulation approach (Jayawardana et al. 2025)

Adaptive trials allow for modifications along the course of the trials. Common modifications include sample size adjustment, dropping an arm (if futile), or even adjusting allocation probabilities. Group sequential design is one of the most utilized forms of adaptive designs allowing for multiple planned interim analyses with pre-specified rules for early stopping. Here, we take a simulation approach to understand adaptive designs.

Code

pacman::p_load(flextable, dplyr, future.apply, progressr)

Randomisation

For simplicity, we use 1:1 treatment allocation ratio using block randomisation of size 4. Uniform random numbers are used to to order treatment allocations thereby producing a sequential list of treatment allocations for consecutively recruited participants.

Code

#' Randomize participants to treatment arms
#'
#' @param n sample size
#' @param blocks randomisation blocks
#'
#' @returns a dataframe
#' @export
#'
#' @examples
simRandomisation <- function(n=584, blocks = 4){
  # blocking sequnce
  block <- rep(seq(1:n), each = blocks, length.out = n)
  # treatment sequence
  trt <- rep(0:1, length.out = n)
  # a random number on a unit interval
  set.seed(as.numeric(Sys.Date()))
  random <- runif(n)
  df <- data.frame(block, trt, random)
  df <- df[order(df$block, df$random),]
  df$obs <- 1:n
  df <- df[, c("obs", "trt")]
  return(df)
}

If we wanted to adjust allocation ratios:

Code

#' Randomize participants to treatment arms
#'
#' @param n sample size
#' @param blocks randomisation blocks
#' @param ratio randomization ratio
#' 
#' @returns a dataframe
#' @export
#'
#' @examples
simRandomisation <- function(n = 584, blocks = 4, ratio = c(1, 1)) {

  # 1. Define the possible treatment groups (0, 1, 2...) based on ratio length
  trts <- 0:(length(ratio) - 1)
  
  # 2. Create the sequence by sampling within each block
  # We use 'replicate' to generate each block independently
  num_blocks <- ceiling(n / blocks)
  
  result <- replicate(num_blocks, {
    sample(trts, size = blocks, replace = TRUE, prob = ratio)
  })
  
  # 3. Flatten the matrix into a vector and trim to length n
  trt_vector <- as.vector(result)[1:n]
  
  # 4. Return as a clean data frame
  return(data.frame(obs = 1:n, trt = trt_vector))
}

Participant accrual

We assume participant accrual is constant over time and would take 928 days.

Code

#' Simulate participant accrual
#'
#' @param n sample size
#' @param recruit_period accrual period in days
#'
#' @returns a vector
#' @export
#'
#' @examples
simAccrual <- function(n = 584, recruit_period = 928){
  # simulate trial times that patient enters the trial
  # adding 0.5 to ensure recruitment times > day 1 when rounded
  accrual_time <- round(runif(n) * recruit_period + 0.5)
  accrual_time <- sort(accrual_time)
  return(accrual_time)
}

If patient accrual was not constant over time, we’d consider a beta distribution

Code

#' Simulate participant accrual
#'
#' @param n sample size
#' @param recruit_period accrual period (days)
#' @param alpha alpha shape parameter
#' @param beta beta shape parameter
#'
#' @returnsa vector
#' @export
#'
#' @examples
simAccrual <- function(n = 584, recruit_period = 928, alpha = 1, beta = 1) {
  
  # 1. Generate random numbers using a Beta distribution
  # rbeta(n, shape1, beta) returns values between 0 and 1
  # alpha= beta = 1 is equivalent to runif (constant accrual)
  # alpha= 2, beta = 1 creates a "late peak" (accelerating accrual)
  # alpha= 1, beta = 2 creates an "early peak" (decelerating accrual)
  # alpha= 2, beta = 2 creates a "bell curve" (peak in middle)
  
  accrual_proportions <- rbeta(n, alpha, beta)
  
  # 2. Scale the 0-1 values to the actual recruit_period
  accrual_time <- round(accrual_proportions * recruit_period + 0.5)
  
  # 3. Sort to represent chronological entry
  accrual_time <- sort(accrual_time)
  
  return(accrual_time)
}

Generate participant outcomes

Choice of probability distribution depends on the outcome variable. You could use the binomial distribution for a binary outcome, normal distribution for a continuous outcome, or Weibull/exponential for time-to-event outcome.

Code

#' Simulate trial data with outcome variable
#'
#' @param n sample size
#' @param recruit_period recruitment period (days)
#' @param p vactor of event probabilities
#'
#' @returns a dataframe
#' @export
#'
#' @examples
simTrialData <- function(n = 584, recruit_period = 928, p){
  # simulate random allocations
  data <- simRandomisation()
  # simulate accrual
  data$accrual <- simAccrual()
  # simulate events from binomial dist
  data$event <- rbinom(n, 1, p[data$trt+1])
  
  return(data)
}

Identify data available at interim analysis

We assume outcome to be available immediately after recruitment and the time of interim analysis to be when a pre-determined number of events to be 20. As the data is ordered by accrual time, we can calculate cumulative number of events cum_events with cum_events=20 indicating the planned time of the interim analysis. Since we assume no time lag between recruitment and outcome assessment, the outcome at the interim (event_iterim: 0,1) would be the same as the outcome at final time point (event) for participants included in the interim analysis (i.e., for observations where obs <= interim_ind)

Code

#' Identify interim data  
#'
#' @param data a dataframe
#' @param events_at_interim number of events at interim
#'
#' @returns a dataframe
#' @export
#'
#' @examples
simInterimData <- function(data, events_at_interim = 20){
  # cumulative events
  data$cum_events <- cumsum(data$event)
  # when does interim occur
  data$interim_ind <- data$obs[min(which(data$cum_events == events_at_interim))]
  # events at interim
  data$event_interim <- with(data, ifelse(obs <= interim_ind, event, NA))
  
  return(data)
}

Analyze trial data

During interim analysis we seek to find out:

if trial would have stopped before maximum recruitment
point estimates and their CIs for treatment effect
sample size at the point of stopping the trial
if the study would have found evidence of clinically relevant treatment effect or if it was a futile trial

Code

#' Analyze trial data
#'
#' @param data trial dataframe
#' @param alpha_interim alpha spend at interim analysis
#' @param alpha_final alpha spend at final analysis
#'
#' @returns a dataframe
#' @export
#'
#' @examples
analyseData <- function(data, alpha_interim, alpha_final){
  nevents0 <- sum(data$event[data$trt == 0])
  nevents1 <- sum(data$event[data$trt == 1])
  pevents0 <- nevents0/sum(data$trt ==0)
  pevents1 <- nevents1/sum(data$trt ==1)
  
  # logistic reg at interim
  data$event_interim <- factor(data$event_interim)
  logit_interim <- glm(event_interim ~ trt, data = data, family = "binomial")
  conf_int <- confint(logit_interim)
  
  # results at interim
  
  interim_or <- exp(coef(logit_interim)["trt"])
  interim_lci <- exp(conf_int["trt", "2.5 %"])
  interim_uci <- exp(conf_int["trt", "97.5 %"])
  interim_p <- coef(summary(logit_interim))["trt", "Pr(>|z|)"]
  
  # stop for trial success at interim
  interim_stop <- ifelse(interim_p < alpha_interim, 1, 0)
  
  # the proportion of events if the trial stopped at interim
  
  if(interim_stop == 1){
    nevents0 <- sum(data$event[data$trt == 0 & !is.na(data$event_interim)])
    nevents1 <- sum(data$event[data$trt == 1 & !is.na(data$event_interim)])
    pevents0 <- nevents0/sum(data$trt == 0 & !is.na(data$event_interim))
    pevents1 <- nevents1/sum(data$trt == 1 & !is.na(data$event_interim))
  }
  # final analysis
  data$event <- factor(data$event)
  logit <- glm(event ~ trt, data = data, family = "binomial")
  conf <- confint(logit)
  
  # results at final analysis
  final_or <- exp(coef(logit)["trt"])
  final_lci <- exp(conf["trt", "2.5 %"])
  final_uci <- exp(conf["trt", "97.5 %"])
  final_p <- coef(summary(logit))["trt", "Pr(>|z|)"]
  
  final_stop <- ifelse(final_p < alpha_final, 1, 0)
  
  # is the trial conclusive?
  stop <- ifelse(interim_stop == 1, interim_stop, final_stop)
  
  # sample size (used to determine average sample size)
  
  if(interim_stop == 1){
    sample_size <- unique(data$interim_ind)
  } else{
    sample_size <- nrow(data)
  }
  
  # determine trial flip-flop probability
  flipflop <- ifelse(interim_stop == 1 & final_stop == 0, 1, 0)
  
  results <- data.frame(
    nevents0, nevents1, pevents0, pevents1, sample_size,
    interim_time = unique(data$interim_ind),
    interim_or, interim_lci, interim_uci, interim_p, interim_stop,
    final_or, final_lci, final_uci, final_p, final_stop, stop, flipflop
  )
  return(results)
}

Avoid in model.matrix.default(mt, mf, contrasts) : variable 1 has no levels errors

Code

#' Analyze trial data
#'
#' @param data trial dataframe
#' @param alpha_interim alpha spend at interim analysis
#' @param alpha_final alpha spend at final analysis
#'
#' @returns a dataframe
#' @export
#'
#' @examples
#' 


analyseData <- function(data, alpha_interim, alpha_final){
  
  # Helper to check if GLM can run (needs variation in both Y and X)
  is_estimable <- function(df, outcome_col) {
    if (nrow(df) == 0) return(FALSE)
    has_outcome_var <- length(unique(df[[outcome_col]])) > 1
    has_trt_var     <- length(unique(df$trt)) > 1
    return(has_outcome_var && has_trt_var)
  }

  # --- Initial Calculations (Total Trial potential) ---
  nevents0 <- sum(data$event[data$trt == 0], na.rm = TRUE)
  nevents1 <- sum(data$event[data$trt == 1], na.rm = TRUE)
  pevents0 <- nevents0 / sum(data$trt == 0)
  pevents1 <- nevents1 / sum(data$trt == 1)
  
  # --- Interim Analysis ---
  # Filter data to only those available at interim for the model
  interim_data <- data[!is.na(data$event_interim), ]
  interim_estimable <- is_estimable(interim_data, "event_interim")
  
  if(interim_estimable){
    logit_interim <- glm(factor(event_interim) ~ trt, data = interim_data, family = "binomial")
    interim_p     <- coef(summary(logit_interim))["trt", "Pr(>|z|)"]
    conf_int      <- confint.default(logit_interim) 
    interim_or    <- exp(coef(logit_interim)["trt"])
    interim_lci   <- exp(conf_int["trt", 1])
    interim_uci   <- exp(conf_int["trt", 2])
  } else {
    interim_p <- interim_or <- interim_lci <- interim_uci <- NA
  }
  
  # Stop for trial success at interim
  interim_stop <- ifelse(!is.na(interim_p) && interim_p < alpha_interim, 1, 0)
  
  # Update event stats if the trial stopped at interim
  if(interim_stop == 1){
    nevents0 <- sum(interim_data$event[interim_data$trt == 0])
    nevents1 <- sum(interim_data$event[interim_data$trt == 1])
    pevents0 <- nevents0 / sum(interim_data$trt == 0)
    pevents1 <- nevents1 / sum(interim_data$trt == 1)
    sample_size <- nrow(interim_data)
  } else {
    sample_size <- nrow(data)
  }
  
  # --- Final Analysis ---
  final_estimable <- is_estimable(data, "event")
  
  if(final_estimable){
    logit    <- glm(factor(event) ~ trt, data = data, family = "binomial")
    final_p  <- coef(summary(logit))["trt", "Pr(>|z|)"]
    conf     <- confint.default(logit)
    final_or  <- exp(coef(logit)["trt"])
    final_lci <- exp(conf["trt", 1])
    final_uci <- exp(conf["trt", 2])
  } else {
    final_p <- final_or <- final_lci <- final_uci <- NA
  }
  
  final_stop <- ifelse(!is.na(final_p) && final_p < alpha_final, 1, 0)
  
  # --- Logic Summary ---
  stop     <- ifelse(interim_stop == 1, 1, final_stop)
  flipflop <- ifelse(interim_stop == 1 & final_stop == 0, 1, 0)
  
  # Build the final data frame
  results <- data.frame(
    nevents0, 
    nevents1, 
    pevents0, 
    pevents1, 
    sample_size,
    interim_time = max(data$interim_ind, na.rm = TRUE), # Assuming this is a time-point or ID
    interim_or, 
    interim_lci, 
    interim_uci, 
    interim_p, 
    interim_stop,
    final_or, 
    final_lci, 
    final_uci, 
    final_p, 
    final_stop, 
    stop, 
    flipflop
  )
  
  return(results)
}

Simulate a single trial

Code

#' Run a single experiment
#'
#' @param n sample size
#' @param recruit_period accrual period
#' @param p vector of treatment success probability
#' @param events_at_interim number of events at interim
#' @param alpha_interim interim alpha spend
#' @param alpha_final final alpha spend
#'
#' @returns a dataframe
#' @export
#'
#' @examples
runTrial <- function(n, recruit_period, p, events_at_interim, alpha_interim, alpha_final){
  #1. simulate trial data
  df <- simTrialData(n, recruit_period, p)
  #2. create interim data
  df <- simInterimData(df, events_at_interim)
  #3. Analyze data
  results <- analyseData(df, alpha_interim, alpha_final)
  return(list(data = df, results = results))
}

Code

# inputs
inputs <- data.frame(
  seed = as.numeric(Sys.Date()),
  # recruitment period = Days in 2.5 years   
  #584/690 is the target/total possible = efficiency ratio
  recruit_period = 365.25 * 3 * 584/ 690,
  n = 584,
  events_at_interim = 20,
  p0 = 0.10,
  p1 = 0.04,
  # p = c(p0, p1)
  alpha_final = 0.045,
  alpha_interim = 0.005
  
)
p <- c(inputs$p0, inputs$p1)

dict <- data.frame(
  variable = c(
    'nevents0','nevents1', 'pevents0', 'pevents1', 'sample_size', 'interim_time',
    "interim_or", "interim_lci", "interim_uci", "interim_p", "interim_stop",
    "final_or", "final_lci", "final_uci", "final_p", "final_stop", "stop",
    "flipflop"
  ),
  label = c(
    "Number of events in control group",
    "Number of events in experimental group",
    "Proportion of events in control group",
    "Proportion of events in experimental group",
    "Sample size",
    "Time (days) at interim",
    "Odds ratio at interim",
    "Lower CI for OR at interim",
    "Upper CI for OR at interim",
    "P-value of any difference in treatments at interim",
    "Whether the trial would have stopped at interim",
    "Odds ratio at final analysis",
    "Lower CI for odds ratio",
    "Upper CI for odds ratio",
    "P-value for treatment difference at final analysis",
    "Whether the trial is conclusive at final analysis",
    "Whether the trial was conclusive (at final or interim)",
    "The probability of trial flip-flopping"
    
    
  )
)

Code

set.seed(inputs$seed)

results_singleTrial <- runTrial(
  inputs$n, inputs$recruit_period, p, inputs$events_at_interim,
  inputs$alpha_interim, inputs$alpha_final
)
results <- results_singleTrial$results|> round(3)  |> t() |> as.data.frame() 
names(results)[1] <- "value"
results$variable <- rownames(results)
rownames(results) <- NULL
results <- left_join(results, dict)

Joining with `by = join_by(variable)`

Code

results <- results[, c("variable", "label", "value")]

Code

tab_singleT <- flextable(results) |> theme_vanilla() 
tab_singleT

Table 1: Results from a single simulated trial

variable	label	value
nevents0	Number of events in control group	16.000
nevents1	Number of events in experimental group	4.000
pevents0	Proportion of events in control group	0.140
pevents1	Proportion of events in experimental group	0.030
sample_size	Sample size	249.000
interim_time	Time (days) at interim	249.000
interim_or	Odds ratio at interim	0.187
interim_lci	Lower CI for OR at interim	0.061
interim_uci	Upper CI for OR at interim	0.577
interim_p	P-value of any difference in treatments at interim	0.004
interim_stop	Whether the trial would have stopped at interim	1.000
final_or	Odds ratio at final analysis	0.305
final_lci	Lower CI for odds ratio	0.141
final_uci	Upper CI for odds ratio	0.661
final_p	P-value for treatment difference at final analysis	0.003
final_stop	Whether the trial is conclusive at final analysis	1.000
stop	Whether the trial was conclusive (at final or interim)	1.000
flipflop	The probability of trial flip-flopping	0.000

Simulate multiple trials

Code

runMultipleTrials <- function(nsims, seed, n, recruit_period, p, 
                              events_at_interim, alpha_interim, alpha_final){
  # random seed
  seeds <- seed + seq(1: nsims)
  # simulate
  multiple_trials <- lapply(1:nsims, function(x){
    set.seed(seeds[x])
    y <- runTrial(n, recruit_period, p, events_at_interim, alpha_interim,
                  alpha_final)
    return(y)
  })
  
  # summarise
  results <- lapply(multiple_trials, function(x) return(x$results))
  results_all <- do.call(rbind, results)
  # saveRDS(multiple_trials, file = "results_multTrials.rds")
  
  # remove any trials with non-estimable CIs and calculate summary
  x <- which(apply(results_all, 1, function(x) any(is.na(x))))
  if(length(x) > 0){
    result_summary <- apply(results_all[-x,], 2, summary)
  } else{
    result_summary <- apply(results_all, 2, summary)
  }
  
  return(
    list(
      results_all = results_all,
      results_summary = results_summary,
      seeds = seeds
    )
  )
}

Parallelize computations using future.apply}

Code

library(future.apply)
library(progressr)

runMultipleTrials_Parallel <- function(nsims, seed, n, recruit_period, p_param, 
                                       events_at_interim, alpha_interim, alpha_final) {
  
  # 1. Setup Parallel Backend
  # Using 'multisession' is the most stable across Windows/Mac/Linux
  plan(multisession, workers = parallelly::availableCores() - 1)
  
  # Ensure cleanup of workers when function finishes or crashes
  on.exit(plan(sequential))
  
  seeds <- seed + seq_len(nsims)
  
  # 2. Access the progressor from the calling environment
  p <- progressor(steps = nsims)
  
  # 3. Run Simulations
  multiple_trials <- future_lapply(1:nsims, function(i) {
    # Increment progress bar
    p(message = sprintf("Simulating trial %g", i))
    
    set.seed(seeds[i])
    
    # Run the individual trial
    # Note: I renamed your 'p' to 'p_param' to avoid conflict with progressor 'p'
    y <- runTrial(n, recruit_period, p_param, events_at_interim, alpha_interim, alpha_final)
    
    return(y)
  }, future.seed = TRUE) 
  
  # 4. Summarize Results
  results <- lapply(multiple_trials, function(x) x$results)
  results_all <- do.call(rbind, results)
  
  # Clean up non-estimable rows (NAs from glm failures)
  results_clean <- results_all[complete.cases(results_all), ]
  result_summary <- apply(results_clean, 2, summary)
  
  return(list(
    results_all = results_all,
    results_summary = result_summary,
    seeds = seeds
  ))
}

Print results

Code

tab_mulT <- results_mulT$results_summary |> round(3) |> t() |> as.data.frame()
tab_mulT$variable <- rownames(tab_mulT)
rownames(tab_mulT) <- NULL
names(tab_mulT) <- c("Minimum", "Q1", "Median", "Mean", "Q3", "Maximum", "variable")
tab_mulT <- left_join(tab_mulT, dict)

Joining with `by = join_by(variable)`

Code

tab_mulT <- tab_mulT |> select(variable, label, everything())
names(tab_mulT) <- stringr::str_to_sentence(names(tab_mulT))

Code

t_mulT <- flextable(tab_mulT) |> theme_vanilla() 
t_mulT

Table 2: Results from a multiple simulated trial

Variable	Label	Minimum	Q1	Median	Mean	Q3	Maximum
nevents0	Number of events in control group	12.000	24.000	28.000	27.582	32.000	47.000
nevents1	Number of events in experimental group	1.000	9.000	11.000	11.037	14.000	25.000
pevents0	Proportion of events in control group	0.043	0.088	0.099	0.101	0.112	0.254
pevents1	Proportion of events in experimental group	0.005	0.030	0.039	0.039	0.047	0.082
sample_size	Sample size	138.000	584.000	584.000	554.664	584.000	584.000
interim_time	Time (days) at interim	116.000	243.000	281.000	286.926	327.000	565.000
interim_or	Odds ratio at interim	0.000	0.252	0.368	0.408	0.518	1.626
interim_lci	Lower CI for OR at interim	0.000	0.083	0.135	0.152	0.201	0.650
interim_uci	Upper CI for OR at interim	0.302	0.760	1.009	Inf	1.340	Inf
interim_p	P-value of any difference in treatments at interim	0.000	0.014	0.052	0.141	0.175	1.000
interim_stop	Whether the trial would have stopped at interim	0.000	0.000	0.000	0.098	0.000	1.000
final_or	Odds ratio at final analysis	0.059	0.286	0.371	0.391	0.473	1.304
final_lci	Lower CI for odds ratio	0.014	0.135	0.184	0.194	0.240	0.651
final_uci	Upper CI for odds ratio	0.241	0.606	0.750	0.793	0.937	2.653
final_p	P-value for treatment difference at final analysis	0.000	0.001	0.006	0.042	0.032	0.994
final_stop	Whether the trial is conclusive at final analysis	0.000	1.000	1.000	0.804	1.000	1.000
stop	Whether the trial was conclusive (at final or interim)	0.000	1.000	1.000	0.805	1.000	1.000
flipflop	The probability of trial flip-flopping	0.000	0.000	0.000	0.001	0.000	1.000

References

Jayawardana, Kaushala S, Michael Dymock, Robert K Mahar, Julie A Marsh, and Katherine J Lee. 2025. “A Practical Guide to Simulation for an Adaptive Trial Design with a Single Interim Analysis.” Trials 26 (1): 400.

Citation

BibTeX citation:

@online{okola2026,
  author = {Okola, Basil},
  title = {Adaptive {Methods} for {Clinical} {Trials}},
  date = {2026-01-12},
  url = {https://bokola.github.io/posts/2026-01-12-Adaptive-designs/},
  langid = {en}
}

For attribution, please cite this work as:

Okola, Basil. 2026. “Adaptive Methods for Clinical Trials.” January 12, 2026. https://bokola.github.io/posts/2026-01-12-Adaptive-designs/.