Conference Agenda

Overview and details of the sessions of this conference.

 
Session Overview
Session
Poster Exhibition: T / Tuesday posters at Biozentrum
Time:
Tuesday, 26/Aug/2025:
11:30am - 1:00pm

Location: Biozentrum, 2nd floor poster area

Presentations
posters-tuesday-BioZ: 1

Advantages and pitfalls of a multi-centre register collecting long-term real-world data on medical devices: Insights from a cochlear implant registry

Karin A. Koinig, Magdalena Breu, Jasmine Rinnofner, Stefano Morettini, Ilona Anderson

MED-EL Medical Electronics, Austria

Background

There is a need for real-world data (RWD) to demonstrate how medical devices function outside the setting of clinical studies and over longer time periods. One way to address this is to establish registries that collect data from routine clinical visits. Here we present our experience evaluating data from pre-surgery to two years post-surgery in a multicentre cochlear implant registry.

Methods

Data were extracted in anonymized form from a registry covering five clinics. The medical devices studied were cochlear implants, which help individuals with severe to profound sensorineural hearing loss (deafness) to regain their hearing. Key outcomes included speech perception, wearing time of the implant, self-perceived auditory benefit, self-reported quality of life, and safety results.

Results

The registry provided extensive data but revealed differences in clinical practices, which made summarizing data across different assessments a challenge. Not all clinics collected the same information, although a minimal measurement data set was specified in the registry protocol. For example, the methods used to assess speech perception varied between centres, including differences in noise levels and test formats. In addition, we observed a high dropout rate, which represents a possible source of bias: particularly at long-term follow-up visits, patients with more problems seemed more likely to return to the clinic, while those with fewer problems were more likely to be adequately cared for by the outpatient clinics and therefore more likely to be lost to follow-up. Overall, this resulted in a substantial amount of missing data, which was difficult to explain to regulatory bodies such as the FDA and TÜV. To address this issue, we presented demographics and outcomes both with and without the patients lost to follow-up.

Conclusion

RWD are valuable but pose a challenge when collected in routine clinical practice, as the diversity of assessments and tests leads to different reporting standards and data gaps that make it difficult to obtain homogeneous and usable data. Statisticians must work with the study team to develop clear and transparent strategies for data collection and data extraction to achieve consistent and reliable results from registries.



posters-tuesday-BioZ: 2

Development and validation of prognostic models in phase I oncology clinical trials

Maria Lee Alcober1,2, Guillermo Villacampa1, Klaus Langohr2

1Statistics Unit, Vall d'Hebron Institute of Oncology (Spain); 2Department of Statistics and Operations Research, Universitat Politècnica de Catalunya (Spain)

Phase I trials are an essential part of oncology drug development. For patients, balancing the potential risks of toxicity against the benefits of investigational drugs is crucial. Consequently, participation in phase I trials requires a minimum life expectancy and the absence of relevant symptoms. However, in clinical practice, no objective measures are used to evaluate these criteria, and decisions rely on subjective judgment. Against this background, this study aims to use different statistical methods to develop and validate prognostic models to better identify oncology patients who may benefit from early-phase clinical trials.

A total of 921 patients treated at the Vall d’Hebron Institute of Oncology from January 2011 to November 2024 were included in this study (799 in the development cohort and 122 in the validation cohort). Different strategies were used to develop the prognostic models: i) stratified Cox's proportional hazards models, ii) stratified Cox models enhanced with restricted cubic splines to address non-linearity, and iii) machine learning techniques such as decision trees and random survival forests to capture complex interactions. Risk scores derived from these models provide interpretable summaries of patient risk profiles, facilitating practical clinical use.

Results were validated using i) internal validation employing bootstrapping and cross-validation and ii) external validation using an independent dataset. Model performance was evaluated through discrimination (C-statistic), calibration (calibration plots and the Hosmer-Lemeshow test), overall performance (Brier score), and clinical utility (decision curve analysis).
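As an illustration of the development/validation workflow described above, the following is a minimal sketch (not the authors' code) that fits a Cox proportional hazards model on a simulated development cohort and checks discrimination on a held-out validation cohort with the C-statistic. The covariate names, effect sizes and censoring are hypothetical; only the 799/122 development/validation split is borrowed from the abstract.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter
from lifelines.utils import concordance_index

rng = np.random.default_rng(1)
n = 921                                           # total patients, as in the abstract
X = pd.DataFrame({
    "albumin": rng.normal(40, 5, n),              # hypothetical prognostic covariates
    "ldh": rng.lognormal(5.5, 0.4, n),
    "n_metastatic_sites": rng.poisson(2, n),
})
# Simulate survival times whose hazard depends on the covariates
risk = 0.03 * (45 - X["albumin"]) + 0.002 * (X["ldh"] - 250) + 0.2 * X["n_metastatic_sites"]
df = X.assign(
    time=rng.exponential(scale=12 * np.exp(-risk)),   # months
    event=(rng.uniform(size=n) < 0.8).astype(int),    # ~20% censored
)

dev, val = df.iloc[:799], df.iloc[799:]           # development and validation cohorts

cph = CoxPHFitter()
cph.fit(dev, duration_col="time", event_col="event")

# Discrimination on the held-out cohort: higher predicted risk should mean
# shorter survival, hence the minus sign in front of the partial hazard.
c_stat = concordance_index(val["time"], -cph.predict_partial_hazard(val), val["event"])
print(f"Validation C-statistic: {c_stat:.2f}")
```

Calibration plots, the Brier score and decision-curve analysis mentioned above would be layered on top of the same split.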

Internal validation consistently outperformed external validation across all performance metrics, particularly in calibration and clinical utility. Among the models, random survival forests achieved the highest C-statistic, demonstrating superior discrimination. Conversely, incorporating restricted cubic splines into the Cox's proportional hazards model did not notably improve the evaluated metrics.

This work offers a replicable framework for deriving and validating risk scores that improve precision in patient selection for phase I trials. Future efforts will focus on formalising calibration methods and comparing these models and scores with other published prognostic tools using external validation.



posters-tuesday-BioZ: 3

Application of Bayesian surrogacy models to select primary endpoint in phase 2 based on relationship to a phase 3 endpoint

Alexandra Jauhiainen1, Enti Spata2, Patrick Darken3, Carla A. Da Silva4

1R&I Biometrics and Statistical Innovation, Late Respiratory & Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden; 2R&I Biometrics and Statistical Innovation, Late Respiratory & Immunology, BioPharmaceuticals R&D, AstraZeneca, Cambridge, UK; 3R&I Biometrics and Statistical Innovation, Late Respiratory & Immunology, BioPharmaceuticals R&D, AstraZeneca, Gaithersburg, US; 4Early Respiratory and Immunology Clinical Development, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden

Background

A key goal of treatment in asthma is to prevent episodes of severe symptom worsening called exacerbations. Designing trials for these relatively rare events is a challenge, especially in the early phases of development of new therapies, as the studies tend to be large and lengthy. Hence, exacerbations are not usually studied as a primary endpoint until phase 3. Alternative endpoints to use in early phase trials of shorter duration can be lung function measurements like FEV1, or the novel endpoint CompEx, which is a composite endpoint enriching exacerbations by adding events defined from deteriorations in diary card variables.

Methods

All three endpoints (FEV1, CompEx, and exacerbations) were evaluated using patient-level data across a set of 14 trials with 27 treatment comparisons. FEV1 was analysed as change from baseline, while CompEx and exacerbations were modelled in both time-to-first-event and recurrent-event settings, across two timeframes (3- and 12-month durations).

Bayesian bivariate random-effect meta-analysis was applied to estimate the total correlation for treatment effects for FEV1 and CompEx with exacerbations. Bayesian surrogacy analysis within the Daniels & Hughes framework was applied across treatment comparisons to evaluate the trial-level relationship between CompEx and exacerbations.
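The following is a minimal PyMC sketch, on simulated data, of the kind of Bayesian bivariate random-effects meta-analysis described above: per-comparison estimated log effects for the two endpoints are modelled around true effects drawn from a bivariate normal distribution, and the posterior of the between-comparison correlation is reported. All numbers, priors and variable names are hypothetical and do not reproduce the authors' analysis or the Daniels & Hughes surrogacy model.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(7)
k = 27                                          # treatment comparisons, as in the abstract
# Hypothetical true (log CompEx HR, log exacerbation RR) pairs with high correlation
true = rng.multivariate_normal([-0.25, -0.30], [[0.04, 0.028], [0.028, 0.06]], size=k)
se_hr = rng.uniform(0.08, 0.20, k)              # within-comparison standard errors
se_rr = rng.uniform(0.10, 0.25, k)
obs_log_hr = true[:, 0] + rng.normal(0, se_hr)
obs_log_rr = true[:, 1] + rng.normal(0, se_rr)

with pm.Model() as bivariate_meta:
    mu = pm.Normal("mu", 0.0, 1.0, shape=2)     # pooled log effects
    chol, corr, sds = pm.LKJCholeskyCov(
        "chol", n=2, eta=2.0, sd_dist=pm.HalfNormal.dist(0.5), compute_corr=True
    )
    theta = pm.MvNormal("theta", mu=mu, chol=chol, shape=(k, 2))  # true per-comparison effects
    pm.Normal("lik_hr", mu=theta[:, 0], sigma=se_hr, observed=obs_log_hr)
    pm.Normal("lik_rr", mu=theta[:, 1], sigma=se_rr, observed=obs_log_rr)
    pm.Deterministic("rho", corr[0, 1])         # between-comparison correlation of effects
    idata = pm.sample(1000, tune=1000, target_accept=0.9, random_seed=1)

print("posterior mean correlation:", float(idata.posterior["rho"].mean()))
```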

Results

The change from baseline in FEV1 at 3 months had a weak correlation with the preferred phase 3 endpoint, the rate ratio for exacerbations at 12 months, and showed limitations in its ability to quantify the effect reported on exacerbations across drug modalities.

In contrast, the CompEx hazard ratio at 3 months correlated well with the 12-month rate ratio observed on exacerbations. CompEx was confirmed as a surrogate in terms of predicting treatment effects observed on exacerbations, with a high level of correspondence between the endpoints across modalities and asthma severities.

Conclusion

FEV1 remains an important respiratory endpoint, especially for drugs with bronchodilating properties, but has limitations as a primary phase 2 endpoint across modalities when the aim is to target exacerbations in phase 3.

CompEx has an increased event frequency compared to exacerbations alone, which is especially noticeable in populations with low exacerbation rates (mild/moderate asthma). This makes CompEx an attractive endpoint for the design of early-phase trials across a range of modalities, especially towards the milder end of the disease spectrum, substantially reducing the required sample sizes.

This research was funded by AstraZeneca.



posters-tuesday-BioZ: 4

Discontinuation and attrition rates in phase II or phase III first-line randomized clinical trials (RCTs) of solid tumors

Virginia Delucchi1, Chiara Molinelli2, Luca Arecco2, Andrea Boutros3, Davide Soldato2, Matteo Lambertini2,3, Dario Trapani4,5, Bishal Gyawali6, Gabe S Sonke7, Sarah R Brown8, Matthew R Sydes9,10, Luca Boni1, Saskia Litiere11, Eva Blondeaux1

1U.O. Epidemiologia Clinica, IRCCS Ospedale Policlinico San Martino, Genova, Italy; 2U.O.C. Clinica di Oncologia Medica, IRCCS Ospedale Policlinico San Martino, Genova, Italy; 3Department of Internal Medicine and Medical Specialties, University of Genova, Genova, Italy; 4Division of New Drugs and Early Drug Development for Innovative Therapies, European Institute of Oncology, IRCCS, Milan 20141, Italy; 5Department of Oncology and Hemato-Oncology, University of Milan, Milan 20122, Italy; 6Division of Cancer Care and Epidemiology, Cancer Research Institute, Queen's University, Kingston, ON, Canada; 7Division of Medical Oncology, Netherlands Cancer Institute, Amsterdam, the Netherlands; 8Leeds Cancer Research UK Clinical Trials Unit, University of Leeds, Leeds, UK; 9BHF Data Science Centre, HDR UK, London, UK; 10Data for R&D, Transformation Directorate, NHS England, London, UK; 11EORTC Headquarters, Brussels, Belgium

Background

Differential discontinuation and attrition rates in randomized controlled trials (RCTs) bias efficacy assessments, potentially leading to misinterpretations of treatment effects. Despite their critical role, the extent and implications of these rates in cancer trials remain unclear. We aimed to systematically quantify discontinuation and attrition rates in RCTs of solid tumors and how variation in these rates might impact the estimated treatment effect on overall survival.

Methods

A systematic review of the published literature was carried out in Medline to identify phase II or phase III RCTs of first-line treatments for solid tumors published from Jan 2015 to Feb 2024. Reported treatment discontinuation and post-study treatment figures were extracted from the CONSORT diagram and/or text. Attrition was computed as the percentage of patients reported as discontinuing study drugs for whom a post-study treatment was not documented. We investigated differences in discontinuation and attrition rates according to type of cancer, sponsor and trial phase. Discontinuation and attrition by treatment arm were not reported due to the potential influence of the experimental treatment on progression. Simulations evaluating the impact of different discontinuation and attrition rates on overall survival will be implemented and presented at the congress.

Results

Out of 22,141 records screened, 533 trials met the inclusion criteria. The majority (56%) were phase III, industry-sponsored (54%) trials; 126 (24%) trials enrolled patients with non-small cell lung cancer, 79 (15%) breast cancer, 53 (10%) colorectal cancer, 40 (8%) other gastrointestinal cancers, 29 (5%) melanoma, 28 (5%) pancreatic cancer and 178 other tumor types. Treatment discontinuation figures were reported in 415 (78%) trials, with a median patient discontinuation rate of 83%. No difference in the patients' treatment discontinuation rate was observed according to sponsor or trial phase. Among the 415 trials reporting patients' treatment discontinuation, data on any post-study treatment were reported in 220 (53%) trials. The median patient attrition rate was 37%. The highest median patient attrition rate was observed for urothelial cancer trials (53%) and the lowest for breast cancer trials (28%). Industry-sponsored trials reported a higher median patient attrition rate than academic trials (38% vs 26%, respectively). No difference in patient attrition rate was observed between phase II and phase III trials.

Conclusions

Although most cancer trials reported treatment discontinuation rates, post-study treatments were less frequently documented. Our results highlight the need to improve the reporting of these figures to ensure transparency, reliability, and accurate assessment of treatment effects on long-term outcome measures.



posters-tuesday-BioZ: 5

Enhancing Dose Selection in Phase I Cancer Trials: Extending the Bayesian Logistic Regression Model with Non-DLT Adverse Events Integration

Luca Genetti, Andrea Nizzardo, Marco Pergher

Evotec - Verona, Italy

This work presents the Burdened Bayesian Logistic Regression Model (BBLRM), an enhancement to the Bayesian Logistic Regression Model (BLRM) for dose-finding in phase I oncology trials. Traditionally, the BLRM determines the maximum tolerated dose (MTD) based on dose-limiting toxicities (DLTs) [1]. However, clinicians often perceive model-based designs like the BLRM as complex and less conservative than rule-based designs, such as the widely used 3+3 method [2,3]. To address these concerns, the BBLRM incorporates non-DLT adverse events (nDLTAEs) into the model. These events, although not severe enough to qualify as DLTs, provide additional information suggesting that higher doses might result in DLTs.

In the BBLRM, an additional parameter δ is introduced to account for nDLTAEs. This parameter adjusts the toxicity probability estimates, making the model more conservative in dose escalation without compromising the accuracy of allocating the true MTD. The δ parameter is derived from the proportion of patients experiencing nDLTAEs and is tuned based on the design characteristics to balance the model's conservatism. This approach aims to reduce the likelihood of declaring toxic doses as the MTD while involving clinicians more directly in the decision-making process through the identification of nDLTAEs during study conduct.
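Since the abstract does not give the exact form of the δ adjustment, the sketch below only illustrates the general idea on made-up data: a standard two-parameter BLRM posterior is computed on a grid, and the probability of overdosing at each dose is recomputed after inflating the toxicity probability by a δ derived (here, purely illustratively, as half the observed nDLTAE proportion) from the nDLTAE counts. Priors, dose levels and data are hypothetical.

```python
import numpy as np
from scipy.special import expit
from scipy import stats

doses = np.array([1.0, 2.5, 5.0, 10.0, 20.0])   # hypothetical dose levels (mg)
d_ref = 10.0                                    # reference dose
n_pat  = np.array([3, 3, 6, 3, 1])              # patients treated so far (made up)
n_dlt  = np.array([0, 0, 1, 1, 0])              # observed DLTs
n_ndlt = np.array([0, 1, 3, 2, 1])              # observed non-DLT adverse events

# Two-parameter BLRM: logit P(DLT | d) = log(alpha) + beta * log(d / d_ref),
# with simplified independent normal priors on log(alpha) and log(beta)
la = np.linspace(-4, 3, 201)
lb = np.linspace(-3, 2, 201)
LA, LB = np.meshgrid(la, lb, indexing="ij")
log_post = (stats.norm.logpdf(LA, loc=np.log(0.25 / 0.75), scale=2.0)
            + stats.norm.logpdf(LB, loc=0.0, scale=1.0))
for d, n, y in zip(doses, n_pat, n_dlt):
    p = expit(LA + np.exp(LB) * np.log(d / d_ref))
    log_post += y * np.log(p) + (n - y) * np.log1p(-p)
post = np.exp(log_post - log_post.max())
post /= post.sum()

# EWOC-style overdose probability, with and without the illustrative delta inflation
for d, n, m in zip(doses, n_pat, n_ndlt):
    p_tox = expit(LA + np.exp(LB) * np.log(d / d_ref))
    delta = 0.5 * m / n if n > 0 else 0.0                    # hypothetical nDLTAE adjustment
    p_over = (post * (p_tox > 0.33)).sum()
    p_over_burdened = (post * (np.minimum(p_tox + delta, 1.0) > 0.33)).sum()
    print(f"dose {d:5.1f} mg: P(overdose) {p_over:.2f} -> burdened {p_over_burdened:.2f}")
```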

The work includes a simulation study comparing the BBLRM with more traditional versions of the BLRM [4,5] and with a two-stage Continual Reassessment Method (CRM) [6] that incorporates nDLTAEs, across various scenarios. The simulations demonstrate that the BBLRM significantly reduces the selection of toxic doses as the MTD without compromising the accuracy of MTD identification. These results suggest that integrating nDLTAEs into the dose-finding process can enhance the safety and acceptance of model-based designs in phase I oncology trials.

References:

1. Neuenschwander B et al. Critical aspects of the bayesian approach to phase I cancer trials. Statistics in Medicine 2008.

2. Love SB et al. Embracing model-based designs for dose-finding trials. British Journal of Cancer 2017.

3. Kurzrock R et al. Moving beyond 3+3: the future of clinical trial design. American Society of Clinical Oncology Educational Book 2021.

4. Zhang H et al. Improving the performance of Bayesian logistic regression model with overdose control in oncology dose-finding studies. Statistics in Medicine 2022.

5. Ghosh D et al. Hybrid continuous reassessment method with overdose control for safer dose escalation. Journal of Biopharmaceutical Statistics 2023.

6. Iasonos A et al. Incorporating lower grade toxicity information into dose finding designs. Clinical Trials 2011.



posters-tuesday-BioZ: 6

Bayesian Inference of Parametric Piecewise Accelerated Failure Time Models for Immuno-oncology Clinical Trials

Xingzhi Xu, Satoshi Hattori

Osaka University, Japan

Modeling delayed treatment effects poses significant challenges in survival analysis, particularly in immuno-oncology trials where Kaplan-Meier curves often exhibit overlapping patterns. Overlapping Kaplan-Meier curves imply that the proportional hazards assumption is violated and that the hazard ratio is not an appealing summary of the treatment effect. They also imply that some patients do not benefit from the immuno-oncology drug. To address these issues, Sunami and Hattori (2024) introduced the piecewise Accelerated Failure Time (pAFT) model, employing a frequentist semi-parametric maximum-likelihood approach to account for delayed treatment effects and to evaluate each patient's probability of benefiting from the treatment. Their framework, while innovative, faced challenges in handling complex treatment-by-covariate interactions.

Building on their foundational work, this paper introduces two Bayesian parametric extensions: the pAFT model and the interactive piecewise Accelerated Failure Time (ipAFT) model. The Bayesian framework enhances the original model by incorporating prior knowledge and improving the precision of parameter estimation. The ipAFT model, in particular, extends the methodology by explicitly modeling treatment-by-covariate interactions, offering deeper insights into treatment efficacy in different subgroups.

Comprehensive simulation studies demonstrate that the proposed Bayesian models perform well in capturing delayed treatment effects, achieving accurate estimates and reliable coverage probabilities even with small sample sizes. The ipAFT model provides two measures of patient-specific treatment effect: the probability of benefiting from the treatment and the patient-specific benefit after the delay time. Applying multivariate analysis techniques (such as hierarchical clustering) to these two measures, we can effectively characterize patients' treatment effects. Application to a real-world immuno-oncology clinical trial dataset reveals distinct patient subgroups based on the results of the ipAFT model.

By addressing key limitations of traditional survival models and extending Sunami and Hattori's pAFT framework, the proposed Bayesian models offer flexible tools for analyzing immuno-oncology clinical trials. Their stability and flexibility make these methods useful in early-phase clinical trials with small patient numbers.



posters-tuesday-BioZ: 7

Bayesian power-based sample size determination for single-arm clinical trials with time-to-event endpoints

Go Horiguchi1, Isao Yokota2, Satoshi Teramukai1

1Department of Biostatistics, Graduate School of Medical Science, Kyoto Prefectural University of Medicine, Japan; 2Department of Biostatistics, Hokkaido University Graduate School of Medicine, Japan

Introduction

Single-arm exploratory trials are widely used in early-phase oncology research to assess the potential of new treatments, often using time-to-event endpoints. Conventional sample size calculations under a frequentist framework typically rely on limited statistics, such as point estimates of survival rates at specific time points or a single hazard ratio (HR). By contrast, Bayesian methods can incorporate prior information and allow interim decisions with greater flexibility. We propose a Bayesian sample size determination method based on posterior and prior predictive probabilities of the hazard ratio, introducing analysis and design priors to improve decision-making accuracy and efficiency.

Methods

In our Bayesian design, we set a target hazard ratio of 1 to show the superiority of the new treatment. Using the analysis prior, we compute the posterior probability that the hazard ratio is below this target. If this probability exceeds a prespecified threshold, we conclude efficacy and stop the trial. For each candidate sample size, we draw from the design prior, generate predicted outcomes under proportional hazards, and calculate the proportion of simulated trials that would meet the stopping criterion. This proportion is the Bayesian power. The smallest sample size achieving the desired power is then selected. Here, the analysis prior encodes historical knowledge about the parameter, while the design prior represents its uncertainty at the planning stage.
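A minimal sketch of the simulation loop described above, using a simplified normal approximation for the log hazard ratio rather than the authors' exact computation; all priors, thresholds and event probabilities below are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def bayesian_power(n, n_sims=2000,
                   analysis_prior=(0.0, 1.0),      # prior on log(HR): mean, sd (hypothetical)
                   design_prior=(np.log(0.7), 0.1),
                   event_prob=0.7,                 # expected proportion of patients with events
                   threshold=0.9):                 # declare efficacy if P(HR < 1 | data) > threshold
    """Approximate Bayesian power for a single-arm time-to-event design,
    using the usual approximation log(HR_hat) ~ N(log HR, 4 / d) for d events."""
    m0, s0 = analysis_prior
    successes = 0
    for _ in range(n_sims):
        log_hr_true = rng.normal(*design_prior)    # draw the truth from the design prior
        d = rng.binomial(n, event_prob)            # events observed among n patients
        if d == 0:
            continue
        se = 2.0 / np.sqrt(d)
        log_hr_hat = rng.normal(log_hr_true, se)   # simulated estimate
        # conjugate normal update with the analysis prior
        post_var = 1.0 / (1.0 / s0**2 + 1.0 / se**2)
        post_mean = post_var * (m0 / s0**2 + log_hr_hat / se**2)
        if stats.norm.cdf(0.0, loc=post_mean, scale=np.sqrt(post_var)) > threshold:
            successes += 1                         # trial meets the stopping criterion
    return successes / n_sims

# Smallest candidate sample size reaching 80% Bayesian power
for n in range(20, 101, 10):
    power = bayesian_power(n)
    print(n, round(power, 3))
    if power >= 0.8:
        break
```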

Results

Simulation results show that more informative analysis priors reduce sample size, while greater uncertainty in the design priors increases it. For designs without interim analysis, the Bayesian method produces sample sizes comparable to or smaller than frequentist methods while maintaining type I error rates. Interim analyses reduce expected sample size and trial duration, with thresholds for posterior probabilities influencing early termination probabilities. Results also demonstrate flexibility in accommodating varying assumptions about survival distributions and parameter uncertainties.

Conclusion

The proposed Bayesian sample size determination method efficiently incorporates prior information and interim analyses, making it a practical alternative to traditional frequentist approaches. This approach enables flexible and rational trial designs, reducing conflicting decisions and improving resource use. Limitations include reliance on the proportional hazards assumption and computational demands for simulation-based power calculations. Future research should explore extensions to handle censoring and other complexities in clinical trials.



posters-tuesday-BioZ: 8

Calibration of dose-agnostic priors for Bayesian dose-finding trial designs with multiple outcomes

Emily Alger1, Shing M. Lee2, Ying Kuen K. Cheung2, Christina Yap1

1The Institute of Cancer Research, United Kingdom; 2Columbia University, USA

Introduction: The goal of dose-finding oncology trials is to assess the safety of novel anti-cancer treatments across multiple doses and to recommend dose(s) for subsequent trials. Based on previous observed responses, trialists dynamically recommend new doses for further investigation during the trial.

Adaptive decision making lends itself to Bayesian learning, with Bayesian frameworks increasingly guiding dose recommendations in model-based dose-finding designs, such as the Continual Reassessment Method (CRM) design. However, these approaches often add complexity by incorporating multiple outcomes and require appropriate prior selection. For trialists who lack prior knowledge, we may look to adopt a dose-agnostic prior – with each dose equally likely to be the a priori optimal dose. However, applying existing methodology to a multiple-outcome CRM may inflate suboptimal, low dose recommendations.

Methods: We broaden calibration techniques for single-outcome trial designs to calibrate dose-agnostic priors for multiple-outcome trial designs, such as designs that jointly evaluate Dose Limiting Toxicities (DLTs) and efficacy responses, or DLTs and patient-reported outcomes (PROs). The a priori probability each dose is identified as the recommended dose is written analytically and optimised using divergence minimisation. A simulation study is presented to demonstrate the effectiveness of calibrated priors for both the PRO-CRM[1] trial design and the joint-outcome CRM model proposed by Wages and Tait[2] in comparison to marginally calibrated priors.

Results: Our analytical and computationally efficient technique maintains an a priori dose agnostic prior whilst improving the probability of correct selection (PCS) and standard deviation of PCS across most simulation scenarios. Thus, jointly calibrated priors reduce the bias present in simulation performance with marginally calibrated priors.

Conclusion: Leveraging analytical expressions for a priori optimal dose recommendations enables computationally efficient implementation and reduces the need for extensive simulations to confirm trial design performance. What’s more, this approach supports trialists to develop deeper intuition about their prior choices, thus strengthening their confidence in selecting robust and suitable priors. As Bayesian dose-finding trial designs continue to advance, research and guidance on the effective calibration of design parameters is essential to support the uptake of Bayesian designs, demonstrate the importance of rigorous prior calibration, and ensure optimal performance in practice.

[1] Lee, Shing M., Xiaoqi Lu, and Bin Cheng. "Incorporating patient‐reported outcomes in dose‐finding clinical trials." Statistics in medicine 39.3 (2020): 310-325.

[2] Wages, Nolan A., and Christopher Tait. "Seamless phase I/II adaptive design for oncology trials of molecularly targeted agents." Journal of biopharmaceutical statistics 25.5 (2015): 903-920.



posters-tuesday-BioZ: 9

Estimands in platform trials with time-treatment interactions

Ziyan Wang, Dave Woods

Statistical Sciences Research Institute (S3RI), University of Southampton, United Kingdom

Background
In long-running platform trials, treatment effects may change over time due to shifts in the recruited population or changes in treatment efficacy—such as increased clinician experience with a novel surgical technique [1]. Most existing studies have assumed equal time trends across treatment arms and controls, focusing on treatment-independent time effects [2,3]. However, when time trends are unequal between treatment arms and controls, the standard estimands can lead to inflated type I error rates, reduced statistical power, and biased treatment effect estimates. In this study, we propose a novel model-based estimand designed to correct for unequal time trends, thereby ensuring robust and accurate inference in platform trials.

Methods
We propose a general model-based estimand based on a time-averaged treatment effect that is adaptable to a variety of time trend patterns in platform trials. In our study, we compare the performance of the standard treatment effect estimand with our generalized estimand in settings where time trends differ between treatment arms and the control. A simulation study is conducted within the framework of Bayesian platform trials—including those employing response-adaptive randomization (RAR)—and performance is evaluated in terms of error rates, bias, and root mean squared error.

Results
Our findings demonstrate that the generalized estimand is robust across various time trend patterns, including nonlinear trends. Flexible modelling with this estimand maintains unbiasedness and reduces power loss compared to the standard estimand. Moreover, the approach remains effective under adaptive randomization rules. All simulation analyses were performed using our “BayesianPlatformDesignTimeTrend” R package, which is publicly available on CRAN.

Conclusion
This work provides a practical and innovative approach for addressing time trend effects in platform trials, offering new insights into the analysis of trials where unequal strength of time trends exists.

[1] K. M. Lee, L. C. Brown, T. Jaki, N. Stallard, and J. Wason. Statistical consideration when adding new arms to ongoing clinical trials: the potentials and the caveats. Trials, 22:1–10, 2021.

[2] Roig, M. B., Krotka, P., Burman, C.-F., Glimm, E., Gold, S. M., Hees, K., Jacko, P., Koenig, F., Magirr, D., Mesenbrink, P., et al. (2022). On model-based time trend adjustments in platform trials with non-concurrent controls. BMC medical research methodology. 22.1, pp. 1–16.

[3] Marschner, I. C., & Schou, I. M. (2024). Analysis of Nonconcurrent Controls in Adaptive Platform Trials: Separating Randomized and Nonrandomized Information. Biometrical Journal, 66(6), e202300334.



posters-tuesday-BioZ: 10

A Graphical Approach to Subpopulation Testing in Biomarker-Driven Clinical Trial Design

Boaz Natan Adler1, Valeria Mazzanti2, Pantelis Vlachos2, Laurent Spiess2

1Cytel Inc., United States of America; 2Cytel Inc., Geneva, Switzerland

Introduction:

As targeted therapies in oncology fast become commonplace, clinical studies are increasingly focused on biomarker-driven hypotheses. This type of research, in turn, requires methods for subpopulation analysis and multiple comparison procedures (MCPs) for sound clinical trials. In our case study, we employed advanced statistical software to design and optimize such a clinical study with a novel graphical approach to the testing sequence and procedures.

Methods:

For this optimization exercise, we interrogated the typical areas of design interest: selecting an appropriate sample size, the required number of events, and the timing and attributes of an interim analysis for the study. In addition, our optimization focused on the testing sequence for the study's biomarker-positive and biomarker-negative subpopulations, as well as a test of the overall study population. We also sought to optimize the MCP employed for the study, examining logrank and stepdown logrank tests, alongside different options for alpha splitting among the tests. Design variations and simulation were conducted using advanced statistical software and relied on a graphical approach to the testing sequence and alpha splitting, in addition to visualizations of other study parameter variations.

Results:

This extensive simulation and optimization work allowed us to select a design that was tailored to the unique treatment effect assumptions of the investigational drug. We were able to convey design tradeoffs and the implications of testing sequence selection and other key design parameters in a graphical, relatable manner to the entire drug development team.

Conclusion:

A graphical approach to designing complex subpopulation analysis-driven clinical trials enables biostatisticians to assess design tradeoffs and selections clearly, while easing design and simulation work, and enhancing communication with governance committees.



posters-tuesday-BioZ: 11

Optimizing Biomarker-Based Enrichment strategies in clinical trials

Djuly Asumpta PIERRE PAUL1, Irina Irincheeva2, Hong Sun3

1Nantes University (France), Bristol-Myers Squibb (Switzerland); 2Bristol-Myers Squibb Boudry (Switzerland); 3Bristol-Myers Squibb Boudry (Switzerland)

Background

Identifying patient groups based on biomarkers is crucial in oncology. Validating a biomarker as a stratification criterion in clinical trials can take several years. Choosing the threshold for continuous biomarkers is particularly challenging, often relying on a limited number of values evaluated with simplistic statistical approaches. Early dichotomization ignores the actual distribution of values and the potentially informative "grey zone".

Methods

In this work, we adapt a biomarker enrichment design to identify the optimal threshold for determining which patients will benefit the most from the experimental treatment. We simulate the Simon & Simon design for binomial and survival endpoints. Various scenarios of chosen thresholds are studied through simulations inspired by existing studies. A ROC curve-based approach to determine the threshold, as well as the Song-Chi closed testing procedure to assess the treatment effect in both the overall population and the biomarker-positive subgroup, are explored.
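A minimal sketch of the ROC-based threshold step on simulated data; the biomarker, the endpoint and the cut-off rule via Youden's J below are hypothetical and not necessarily the criterion used in the study.

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(3)
n = 500
biomarker = rng.normal(0, 1, n)
# Simulated binomial endpoint: response probability increases with the biomarker
response = rng.binomial(1, 1 / (1 + np.exp(-(biomarker - 0.4) * 1.5)))

fpr, tpr, thresholds = roc_curve(response, biomarker)
youden_j = tpr - fpr
optimal_threshold = thresholds[np.argmax(youden_j)]
print(f"ROC-based threshold: {optimal_threshold:.2f}  (J = {youden_j.max():.2f})")

# Patients above the threshold form the biomarker-positive subgroup that can then
# be tested alongside the overall population (e.g. with the Song-Chi procedure).
biomarker_positive = biomarker >= optimal_threshold
```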

Results

Initial results suggest that our proposal effectively controls the Type I error for both binomial and survival endpoints. Additionally, switching to a ROC-curve approach for estimating the biomarker threshold improves statistical power by approximately 14%. Furthermore, incorporating the Song-Chi method allows testing of the difference in treatment effects between the standard control group and the experimental group in both the overall population (all the patients enrolled in the trial) and among the biomarker-positive patients, the patients most likely to benefit from the treatment. This method maintains rigorous Type I error control while still ensuring adequate power. Moreover, it facilitates the detection of treatment-specific fluctuations and subgroup dynamics within these two populations, leading to a more nuanced and precise analysis.

Conclusion

In conclusion, this study highlights the importance of a more nuanced approach in selecting biomarker thresholds and improving biomarker enrichment strategy for clinical trials, which is essential to accelerate the development of personalized therapies while optimizing the efficiency of clinical trials.

References

Simon N, Simon R. Adaptive enrichment designs for clinical trials. Biostatistics. 2013;14(4):613-625. doi: 10.1093/biostatistics/kxt010.

Song Y, Chi GY. A method for testing a prespecified subgroup in clinical trials. Stat Med. 2007;26(19):3535-3549. doi: 10.1002/sim.2825.



posters-tuesday-BioZ: 12

Leveraging Synthetic Data for Enhanced Clinical Research Outcomes

Szymon Musik1,2, Agnieszka Kowalewska3, Gianmarco Gallone3, Jacek Zalewski3, Joanna Sasin-Kurowska3

1Late Phase Global Clinical Data Management, Clinical Data & Insights, BioPharmaceutical Clinical Operations, R&D, AstraZeneca, Warsaw, Poland; 2Department of Education and Research in Health Sciences, Medical University of Warsaw, Poland; 3Clinical Programming, Clinical Data & Insights, BioPharmaceutical Clinical Operations, R&D, AstraZeneca, Warsaw, Poland

Background / Introduction: In recent years, the pharmaceutical industry has been under immense pressure to make drug development faster and more efficient. Traditional clinical trials often face obstacles like high costs, prolonged durations, and challenges in participant recruitment, particularly for rare diseases. Additionally, testing of programming tools, databases, and software before acquiring patient data is cumbersome. Synthetic Data in Clinical Trials (SDCT) offers an innovative solution by providing high-quality, clinically realistic datasets that meet strict privacy conditions, facilitating thorough research.

Methods: We developed AstraZeneca’s Study Synthetic Data Tool (SYNDATA), which generates synthetic data for a study (referred to as the target study) using its Architect Loader Spreadsheet (ALS) and data from an ongoing or completed study (referred to as the base study). Importantly, the target study may not yet have any data collected. Our pipeline leverages the event chronology specified by the ALS, allowing scenarios for each patient to be created before data generation. We categorize dataset variables into groups based on types, such as dates or binary options (e.g., Yes/No), and use designated methods for generating these variables. This approach employs classic statistical techniques like kernel density estimation and Bayesian networks. Designed primarily for study set-up testing, SYNDATA explores potential variable values in the target study while preserving relationships from the base study. It can also incorporate incorrect values into the data if necessary.
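A minimal sketch of one of the building blocks mentioned above, kernel density estimation for a continuous column; the variable names are hypothetical and this is not the SYNDATA implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2025)

# Pretend these are systolic blood pressure values from a completed base study
base_sbp = rng.normal(128, 14, size=400)

kde = gaussian_kde(base_sbp)                 # estimate the density of the base-study column
synthetic_sbp = kde.resample(size=250)[0]    # draw a synthetic column for the target study

# Categorical variables (e.g. Yes/No) can simply be resampled from observed frequencies
base_yes_no = rng.choice(["Yes", "No"], p=[0.3, 0.7], size=400)
values, counts = np.unique(base_yes_no, return_counts=True)
synthetic_yes_no = rng.choice(values, p=counts / counts.sum(), size=250)
```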

Results: Incorporating synthetic data into clinical trials has substantially eased data scarcity challenges. SYNDATA generates synthetic data as soon as the ALS for a study is available, enabling users to test programming tools, databases, software, and visualizations. Furthermore, synthetic data supports data science projects. SYNDATA is secure and ensures patient privacy.

Conclusion: Synthetic data is set to transform clinical trials by addressing the current challenges in the pharmaceutical industry. It reduces development timelines and enhances data integration efficiency, allowing more reliable trial simulations. Adopting synthetic data as a vital component of clinical research could reshape conventional practices and usher in a new era of data-driven drug development.



posters-tuesday-BioZ: 13

Graph-Based Integration of Heterogeneous Biological Data for Precision Medicine: A Comparative Analysis of Neo4j and MySQL

Byoung Ha Yoon

KRIBB (Korea Research Institute of Bioscience and Biotechnology), Republic of Korea (South Korea)

Precision medicine aims to provide personalized treatment plans tailored to individual patients. However, the complexity and scale of biomedical data, coupled with the exponential growth of clinical knowledge derived from diverse biological databases and scientific publications, pose significant challenges in clinical applications. A key challenge in this context is understanding and integrating the intricate relationships between heterogeneous biological data types.

In this study, we address this challenge by integrating multiple biological datasets—such as protein-protein interactions, drug-target associations, and gene-disease relationships—into a unified graph database. The constructed graph consists of approximately 150,000 nodes and 100 million relationships, with data pre-processed to remove redundancies. To assess the suitability of graph-based databases for handling complex biological networks, we compared the performance of Neo4j, a state-of-the-art graph database, with MySQL, a traditional relational database. Our results demonstrate that while MySQL struggled with complex queries involving multiple joins, Neo4j exhibited superior performance, providing rapid responses to the same queries.
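To make the comparison concrete, the sketch below expresses the same multi-hop question once as a multi-join SQL query and once as a Cypher pattern executed through the official neo4j Python driver; the schema, labels, connection details and disease identifier are hypothetical and not the study's.

```python
from neo4j import GraphDatabase

# "Which drugs target proteins that interact with proteins encoded by genes
#  linked to a given disease?" -- relational form with several joins:
SQL_VERSION = """
SELECT DISTINCT d.name
FROM gene_disease gd
JOIN ppi            ON ppi.protein_a = gd.protein_id
JOIN drug_target dt ON dt.protein_id = ppi.protein_b
JOIN drug d         ON d.id = dt.drug_id
WHERE gd.disease_id = %s;
"""

# The same question as a single Cypher path pattern:
CYPHER_VERSION = """
MATCH (dis:Disease {id: $disease_id})<-[:ASSOCIATED_WITH]-(:Gene)-[:ENCODES]->
      (:Protein)-[:INTERACTS_WITH]-(:Protein)<-[:TARGETS]-(drug:Drug)
RETURN DISTINCT drug.name AS name
"""

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    names = [record["name"] for record in
             session.run(CYPHER_VERSION, disease_id="DOID:1612")]
driver.close()
print(names)
```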

These findings emphasize the potential of graph databases for efficiently storing and querying complex biological relationships. Moreover, the interconnected nature of biological data in graph structures facilitates the application of computational biology techniques, such as network analysis and clinical biostatistics, to uncover hidden patterns and infer new insights. This approach not only enhances the understanding of biological systems but also holds promise for improving clinical decision-making and advancing the field of precision medicine.



posters-tuesday-BioZ: 14

Revolutionizing Clinical Data Management: A Strategic Roadmap for Integrating AI/ML into CDM

Joanna Magdalena Sasin-Kurowska1, Szymon Musik1, Mariusz Panczyk2

1Astra Zeneca, Poland; 2Medical University of Warsaw, Poland

Clinical Data Management (CDM) is essential in clinical research, ensuring the accuracy and integrity of data for regulatory submissions. As clinical trials become more complex and generate larger volumes of data—especially in Phase III trials—there is a growing need for advanced tools to manage and analyze this information. This poster highlights key findings from our research on integrating Artificial Intelligence (AI) and Machine Learning (ML) into CDM, transforming it into Clinical Data Science (CDS). By reviewing literature from 2008 to 2025, we identified emerging trends such as the use of Natural Language Processing (NLP) to analyze unstructured data, AI/ML for automating data cleaning and analysis, and new technologies like blockchain, wearable devices, and patient-centric approaches. Our results indicate that AI/ML can improve data quality, automate processes, and enhance predictive analytics, offering a more efficient and scalable solution for clinical research. We also present a roadmap for successfully integrating AI/ML into CDM to drive innovation and advance clinical research. This review emphasizes the need for a strategic, multidisciplinary approach to fully leverage these technologies for more efficient and accurate clinical trials.



posters-tuesday-BioZ: 15

Strategies to scale up model selection for analysis of proteomic datasets using multiple linear mixed-effect models

Ilya Potapov, Matthew Davis, Adam Boxall, Francesco Tuveri, George Ward, Simone Jueliger, Harpreet Saini

Astex Pharmaceuticals, United Kingdom

Linear mixed-effect models (LMEMs) are a key tool for modelling biomedical data with dependencies. For example, longitudinal read-outs from patients must account for the correlation between samples, which violates the independence assumption of the standard linear modelling approach. Designing an LMEM, in terms of the factors and interactions that constitute the model, is an elaborate process that takes into account both the formal analysis of the model variance and the endpoints of the study. While there are multiple views on how best to design LMEMs, this process normally takes place at the level of a single model. In biomedical applications, however, we are often interested in multiple comparisons. In this case, the LMEM design process should be scaled up to optimise the model design for all comparisons simultaneously. In this work, we considered an example of the multiple-design problem in a proteomic experiment. We showed how a general framework for multiple LMEM designs can be established via the analysis of variance of the full and restricted (nested) models. This analysis included forming the P-value distribution for each of the factor terms and subsequently analysing that distribution. We also demonstrated that the multiple-design framework necessarily raises the question of whether all models should share the same universal design or whether individually tailored models should be used per protein. Both pathways are possible from a methodological point of view, yet they may have different implications for statistical inference. We discuss these implications.
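The sketch below illustrates, on simulated data with hypothetical column names, the scaled-up workflow described above: a full and a restricted LMEM are fitted per protein by maximum likelihood, and the likelihood-ratio p-values for the factor of interest are collected so that their distribution across proteins can be examined. It is not the authors' pipeline.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(11)
patients, timepoints, proteins = 20, 3, 20
rows = []
for prot in range(proteins):
    for pat in range(patients):
        base = rng.normal(0, 1)                       # patient-level random intercept
        arm = "treated" if pat % 2 else "control"
        for t in range(timepoints):
            effect = 0.6 * t if (arm == "treated" and prot < 5) else 0.0
            rows.append(dict(protein=f"P{prot}", patient=pat, arm=arm, time=t,
                             abundance=base + 0.2 * t + effect + rng.normal(0, 0.5)))
df = pd.DataFrame(rows)

pvals = {}
for prot, sub in df.groupby("protein"):
    # ML (not REML) fits so the likelihood-ratio test between nested models is valid
    full = smf.mixedlm("abundance ~ arm * time", sub, groups=sub["patient"]).fit(reml=False)
    restricted = smf.mixedlm("abundance ~ arm + time", sub, groups=sub["patient"]).fit(reml=False)
    lr = 2 * (full.llf - restricted.llf)
    df_diff = len(full.fe_params) - len(restricted.fe_params)
    pvals[prot] = stats.chi2.sf(lr, df_diff)

# The shape of this p-value distribution informs whether the interaction term belongs
# in a universal design or only in tailored per-protein models.
print(pd.Series(pvals).describe())
```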



posters-tuesday-BioZ: 16

Cost-utility analysis of sodium-glucose cotransporter-2 inhibitors on chronic kidney disease progression in diabetes patients: a real-world data in Thailand

Sukanya Siriyotha1, Amarit Tansawet2, Oraluck Pattanaprateep1, Tanawan Kongmalai3, Panu Looareesuwan1, Junwei Yang1, Suparee Wisawapipat Boonmanunt1, Gareth J McKay4, John Attia5, Ammarin Thakkinstian1

1Department of Clinical Epidemiology and Biostatistics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand; 2Department of Research and Medical Innovation, Faculty of Medicine Vajira Hospital, Navamindradhiraj University, Bangkok, Thailand; 3Division of Endocrinology and Metabolism, Faculty of medicine Siriraj Hospital Mahidol University, Bangkok, Thailand; 4Centre for Public Health, School of Medicine, Dentistry and Biomedical Sciences, Queen’s University Belfast, Belfast, United Kingdom; 5School of Medicine and Public Health, and Hunter Medical Research Institute, University of Newcastle, New Lambton, New South Wales, Australia

Introduction and Objective(s): Type 2 diabetes (T2D) increases the risk of micro- and macro-vascular complications, including chronic kidney disease (CKD), a major burden that can significantly impair quality of life and socioeconomic status. Evidence from numerous clinical trials demonstrates the benefits of sodium-glucose co-transporter 2 inhibitors (SGLT2is) in CKD prevention. However, the high cost of SGLT2is may limit their accessibility, despite economic evaluations suggesting cost-effectiveness. Therefore, this study aims to conduct a cost-utility analysis using real-world data in Thailand to provide more realistic and relevant evidence for policy decisions.

Method(s) and Results: Clinical and cost data of CKD patients between 2012 and 2022 were retrieved from the Ramathibodi T2D data warehouse. A Markov model was constructed with the following states: CKD stage 3, 4, 5, and death. A cost-utility analysis estimating the cost per quality-adjusted life year (QALY) for the two interventions (non-SGLT2i versus SGLT2i) was performed from a societal perspective. The incremental cost-effectiveness ratio (ICER) was calculated by dividing the difference in costs between the compared treatments by the difference in QALYs. A total of 20,735 patients were recruited. The lifetime costs were US$72,234.98 and 74,887.31 in patients with renal replacement therapy (RRT) and US$71,638.41 and 74,749.86 in patients without RRT, for non-SGLT2i and SGLT2i, respectively. ICERs were US$955.40 and 1,114.56 per QALY in patients with and without RRT, respectively.
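The sketch below shows the mechanics of the Markov cohort model and ICER calculation described above; every transition probability, cost and utility weight is made up for illustration and none is taken from the study (only the US$4,651 willingness-to-pay threshold is quoted from the abstract).

```python
import numpy as np

states = ["CKD3", "CKD4", "CKD5", "Death"]
# hypothetical annual transition matrices (rows sum to 1)
P_no_sglt2i = np.array([[0.85, 0.10, 0.03, 0.02],
                        [0.00, 0.80, 0.15, 0.05],
                        [0.00, 0.00, 0.88, 0.12],
                        [0.00, 0.00, 0.00, 1.00]])
P_sglt2i    = np.array([[0.90, 0.06, 0.02, 0.02],
                        [0.00, 0.86, 0.10, 0.04],
                        [0.00, 0.00, 0.90, 0.10],
                        [0.00, 0.00, 0.00, 1.00]])
annual_cost = {"no":  np.array([1500, 3000, 12000, 0]),   # US$, hypothetical
               "yes": np.array([2100, 3600, 12600, 0])}   # drug cost added
utility = np.array([0.80, 0.70, 0.55, 0.0])               # hypothetical QALY weights

def run_markov(P, cost, cycles=30, discount=0.03):
    dist = np.array([1.0, 0.0, 0.0, 0.0])        # everyone starts in CKD stage 3
    total_cost = total_qaly = 0.0
    for t in range(cycles):
        disc = 1.0 / (1 + discount) ** t
        total_cost += disc * dist @ cost
        total_qaly += disc * dist @ utility
        dist = dist @ P                           # advance the cohort one annual cycle
    return total_cost, total_qaly

c0, q0 = run_markov(P_no_sglt2i, annual_cost["no"])
c1, q1 = run_markov(P_sglt2i, annual_cost["yes"])
icer = (c1 - c0) / (q1 - q0)
print(f"ICER = {icer:,.0f} US$/QALY (willingness-to-pay threshold: 4,651 US$/QALY)")
```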

Conclusions: SGLT2i was associated with higher treatment costs compared with non-SGLT2i. However, SGLT2i was still cost-effective considering Thailand's willingness-to-pay threshold of US$4,651 per QALY.

Keywords: Cost-utility analysis (QALY), Real-world data, Type 2 diabetes (T2D), Chronic kidney disease (CKD), Sodium-glucose co-transporter 2 inhibitors (SGLT2is)

References:

[1] Beckman JA, Creager MA. Vascular Complications of Diabetes. Circulation Research. 2016;118(11):1771-85.

[2] Wanner C, Inzucchi SE, Lachin JM, Fitchett D, von Eynatten M, Mattheus M, et al. Empagliflozin and Progression of Kidney Disease in Type 2 Diabetes. N Engl J Med. 2016;375(4):323-34.

[3] Reifsnider OS, Kansal AR, Wanner C, Pfarr E, Koitka-Weber A, Brand SB, et al. Cost-Effectiveness of Empagliflozin in Patients With Diabetic Kidney Disease in the United States:



posters-tuesday-BioZ: 17

Comparing the Safety and Effectiveness of Covid-19 Vaccines administered in England using OpenSAFELY: A Common Analytic Protocol

Martina Pesce1, Christopher Wood1, Helen McDonald2, Frederica Longfoot1, Venexia Walker3, Edward PK Parker4, William J Hulme1

1Bennett Institute for Applied Data Science, Nuffield Department of Primary Care Health Science, Oxford University, UK; 2University of Bath, UK; 3Population Health Sciences, Bristol Medical School, University of Bristol, UK; 4London School of Hygiene and Tropical Medicine, UK

Background

In England, Covid-19 vaccination campaigns have been delivered in Spring and Autumn each year since 2021, and this pattern is set to continue for the foreseeable future. At least two vaccine products are used each campaign to mitigate any potential unforeseen supply or safety issues.

Post-authorisation evaluations of these vaccines in routine, out-of-trial settings are crucial: the incidence of longer-term and rarer outcomes is often not reliably estimable in trials, and vaccines may perform differently in more diverse population groups or in the context of newer viral variants.

The regularity and similarity of campaigns, including future campaigns, coupled with the availability of reliable routinely-collected health data on who is getting which vaccine and when, provides an opportunity to specify a single analysis protocol that can be reused across multiple campaigns.

Methods

We developed a Common Analytic Protocol to compare the safety and effectiveness of vaccine products used in each Covid-19 vaccination campaign. Planned analyses will use the OpenSAFELY research platform which provides secure access to routinely-collected health records for millions of people in England.

The protocol uses complementary approaches to control for confounding (one-to-one matching without replacement and inverse probability of treatment weighting) to compare products for a variety of safety and effectiveness endpoints, within a variety of population subgroups, and with various accompanying sensitivity analyses and balance checks. The analogous hypothetical randomised trial that the design emulates is also described.
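As an illustration of the second confounding-control approach named above, here is a minimal inverse-probability-of-treatment-weighting sketch on simulated data; the covariates and model are hypothetical, and this is Python rather than the protocol's R implementation.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n = 10_000
covariates = pd.DataFrame({
    "age": rng.normal(55, 15, n),
    "n_comorbidities": rng.poisson(1.2, n),
    "prior_doses": rng.integers(2, 6, n),
})
# Product A vs product B; allocation depends on the covariates (confounding)
lin = 0.03 * (covariates["age"] - 55) + 0.3 * covariates["n_comorbidities"]
product_a = rng.binomial(1, 1 / (1 + np.exp(-lin)))

ps_model = LogisticRegression(max_iter=1000).fit(covariates, product_a)
propensity = ps_model.predict_proba(covariates)[:, 1]

# Stabilised inverse-probability weights
p_treated = product_a.mean()
weights = np.where(product_a == 1,
                   p_treated / propensity,
                   (1 - p_treated) / (1 - propensity))

# A weighted outcome model or weighted Kaplan-Meier comparison would follow,
# together with covariate balance checks on the weighted sample.
print(pd.Series(weights).describe())
```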

All design elements are specified explicitly in R scripts, fully executable against simulated dummy data before any real data is available for analysis.

Discussion

The ability to plan analyses comparing vaccine products well in advance of the delivery of the campaign has numerous benefits and challenges, which will be described in this talk. We invite feedback on the proposed design prior to its use in real data.



posters-tuesday-BioZ: 18

Statistical requirements in medical diagnostic development across the UK, US, and EU markets: A review of regulation, guidelines and standards.

Timothy Hicks1,2, Joseph Bulmer1, Alison Bray1, Jordan L. Oakley3, Rachel L. Binks2,3, Kile Green2, Will S. Jones4, James M.S. Wason3, Kevin J. Wilson3

1Newcastle Upon Tyne Hospitals NHS Foundation Trust, United Kingdom; 2NIHR HealthTech Research Centre in Diagnostic and Technology Evaluation, United Kingdom; 3Newcastle University, United Kingdom; 4Centre of Excellence for Data Science, Artificial Intelligence and Modelling (DAIM), University of Hull, United Kingdom

Background: When developing novel medical diagnostic devices, including In Vitro Diagnostics, Medical Diagnostic Software, and General Medical Devices, developers must conform to their chosen markets’ regulations. In developing novel statistical methods to support diagnostic development, such as the use of adaptive design for sample size reassessment, it is paramount that the regulations, and associated guidance, do not preclude the proposed novel methodology. This review of legislation, official policy guidance, and standards across the UK, EU, and US aimed to identify regulatory requirements or restrictions relating to statistical methodology for diagnostics development.

Methods: Data sources identified for legislation, official policy guidance, and standards included: EUR-Lex, WestLaw UK, US Food and Drug Administration (FDA), Lexis+, Policy Commons, Medical Device Co-ordination Group (MDCG), and the British Standards Online Library. These data sources were searched for records relating to medical diagnostic development. Search terms included: Medical Device, In Vitro Diagnostic, Medical Diagnostic, Diagnostic, and IVD. Identified records were double-screened for inclusion, including a within-document search for 25 key terms related to statistical requirements and diagnostic development. Identified terms were coded, and relevant statistical requirements, both mandatory and recommended, were extracted.

Results: This systematic review identified 2479 potential records, 540 of which met the inclusion criteria for data extraction, of which 139 had statistical requirements or recommendations related to medical diagnostic development. Mandatory requirements for specific tests or conditions were identified across the three regions (Total: n = 187, UK = 12, EU = 82, US = 93). Examples of requirements include minimum sample sizes and specific populations when demonstrating diagnostic accuracy in certain high-risk conditions. For example, the EU Common Technical Specifications require first-line assays for anti-HIV1/2 to include ≥400 positive HIV-1 and ≥100 positive HIV-2 specimens, of which 40 are non-B subtypes and 25 are 'same day' fresh serum. This review also identified non-mandatory recommendations for best practice in diagnostic development and trial design, covering evidence requirements, statistical validity, study design, and study management.

Conclusion: Whilst mandatory statistical requirements exist for high-risk areas, thereby limiting the potential benefit of an adaptive trial where sample sizes are mandated, there remains a great opportunity for the development of novel methodologies and adaptive trial designs in medical diagnostics. This review will support the future development of a framework for designing adaptive trials in medical diagnostics, empowering statisticians and developers to improve efficiency whilst meeting regulatory requirements.



posters-tuesday-BioZ: 19

Calf muscle development in NICU graduates compared with typically developing babies: an analysis of growth trajectories using linear mixed models

Alana Cavadino1, Sian Williams2,3, Malcolm Battin4, Ali Mirjalili5, Louise Pearce6, Amy Mulqueeney4, N. Susan Stott7

1Epidemiology & Biostatistics, Faculty of Medical and Health Sciences, University of Auckland, New Zealand; 2Curtin School of Allied Health, Faculty of Health Sciences, Curtin University, Australia; 3Liggins Institute, University of Auckland, New Zealand; 4Newborn Services, Starship Child Health, Auckland District Health Board, New Zealand; 5Department of Anatomy and Medical Imaging, Faculty of Medical and Health Sciences, University of Auckland, New Zealand; 6Auckland Children’s Physiotherapy, Auckland, New Zealand; 7Department of Surgery, Faculty of Medical and Health Sciences, University of Auckland, New Zealand

Background / Introduction

Preterm birth and Neonatal Intensive Care Unit (NICU) admission are related to adverse health consequences in early childhood and beyond. This study evaluated lower leg muscle growth and motor development in the first 12 months of life in NICU graduates compared to typically developing (TD) infants.

Methods

We conducted a prospective, longitudinal study of infants born in Auckland, New Zealand, who were either born without complications and recruited from the community (TD) or discharged from a NICU and classed as intermediate-risk (NICU-IR) or higher-risk (NICU-HR) based on additional risk factors for adverse neurodevelopmental outcomes. Muscle volume and gross motor development were assessed at term-corrected ages of 3, 6 and 12 months (±1 month). Linear mixed models with REML and the Kenward-Roger small-sample adjustment were used to estimate trajectories in Triceps Surae muscle volume measurements (Medial Gastrocnemius, Lateral Gastrocnemius, Soleus, and total Triceps Surae). Models included random intercepts for individuals and slopes for term-corrected age, and fixed effects for term-corrected age (months), body side (left/right leg), group (TD/NICU-IR/NICU-HR), and sex. Non-linear terms and interactions (by group and by side) for term-corrected age, and different variance-covariance structures, were evaluated. Estimated group trajectories and marginal means at 3, 6 and 12 months term-corrected age are presented.
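A minimal sketch, on simulated data, of the type of model specified above: muscle volume modelled with fixed effects for term-corrected age, group, side and sex, and a random intercept and random age slope per infant. statsmodels fits by REML but does not provide the Kenward-Roger small-sample adjustment used in the study, so the inference shown is only approximate; all numbers are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(9)
rows = []
for infant in range(60):
    group = rng.choice(["TD", "NICU-IR", "NICU-HR"])
    sex = rng.choice(["F", "M"])
    b0 = rng.normal(10, 2)                                  # infant-specific intercept (cm^3)
    b1 = rng.normal(2.0 if group == "TD" else 1.5, 0.3)     # infant-specific growth slope
    for age in (3, 6, 12):
        for side in ("left", "right"):
            rows.append(dict(infant=infant, group=group, sex=sex, side=side, age=age,
                             volume=b0 + b1 * age + rng.normal(0, 1)))
df = pd.DataFrame(rows)

model = smf.mixedlm("volume ~ age * group + side + sex", df,
                    groups=df["infant"],
                    re_formula="~age")      # random intercept and random slope for age
fit = model.fit(reml=True)
print(fit.summary())
```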

Results

Sixty-one infants were recruited: n=24 TD, n=14 NICU-IR, and n=23 NICU-HR. NICU infants had lower birthweight (1.7±0.9 kg) and length (40.3±6.2 cm) compared to TD infants (3.3±0.5 kg; 51.1±2.8 cm). COVID-19 restrictions meant some 6- and 12-month assessments occurred late, with variable timings. For muscle volume measures, there were significant term-corrected age*group and (term-corrected age)²*group interactions, indicating that muscle growth trajectories over time differed by group (Medial Gastrocnemius, Lateral Gastrocnemius, Triceps Surae, p<0.001; Soleus, p=0.04). Negative correlations between random intercepts and slopes indicated that lower muscle volume at 3 months term-corrected age was associated with faster growth. Between 3 and 12 months term-corrected age, Triceps Surae increased on average by 18.1 cm³ (95% CI: 16.1-20.2 cm³), 13.3 cm³ (10.6-16.0 cm³) and 12.5 cm³ (10.5-14.6 cm³) in TD, NICU-IR, and NICU-HR infants, respectively. Soleus was smaller at 6 and 12 months term-corrected age for both NICU groups, and Lateral Gastrocnemius was smaller at 12 months term-corrected age for NICU-HR (p<0.001). At 12 months term-corrected age, raw Gross Motor Quotient scores were lower for NICU-HR (p=0.005), and <10% of NICU infants were walking compared to 30% of TD infants.

Conclusion

Failure of typical Soleus growth over the first year contributed to a smaller Triceps Surae at 12-months term-corrected-age in NICU graduates. These findings add to the increasing body of evidence for an adverse impact of preterm birth and NICU stays on infant skeletal muscle growth.



posters-tuesday-BioZ: 20

Automating Report Generation with Stata: A Case Study of NORUSE

Maria Elstad

Helse Stavanger, Norway

Abstract

The Norwegian Service User Registry (NORUSE) is a comprehensive health registry utilized by Norwegian municipalities to document service recipients with substance abuse and/or mental health issues. The primary goal of NORUSE is to gather knowledge about the extent of services and the expected demand for services for this patient group. This data supports the formulation of municipal substance abuse policies, better decision-making regarding prioritization of user groups, and improved evaluation of service offerings. Nationally, the statistics contribute to the data foundation for shaping national policies for mental health and substance abuse work.

In 2024, we generated 64 automated municipality reports using VBA code in Excel. However, we have begun exploring the use of the Stata command putdocx for creating these reports. We are already using it for subgroup analyses and for regional and national reports. This exploration highlights the potential of putdocx to streamline the process of generating detailed and consistent reports. Although we have also considered other software like Power BI, we found it less flexible than Stata, despite its superior graphing capabilities.

By employing putdocx, we can automate the creation of reports, which is particularly beneficial for municipalities that receive community-specific reports shortly after data collection. Additionally, Helse Stavanger produces regional and national reports, further leveraging the efficiency of automated report generation. The integration of putdocx in our reporting workflow enhances the accuracy and timeliness of data presentation, supporting better decision-making and policy formulation.

As we consider employing this method more broadly, we anticipate significant improvements in our ability to provide clear snapshots of users' situations based on the latest contact status. This tool contributes significantly to the ongoing efforts to improve service delivery for individuals with substance abuse and mental health challenges. The flexibility and scalability of putdocx make it a promising solution for our future reporting needs.



posters-tuesday-BioZ: 21

Maternal Mortality Rate in Sudan 2020: Causes of Death, Obstetric Characteristics and Territorial Disparity Using Statistical Analysis.

Mohammed Abdu Mudawi

Freelance Senior Statistician, Health Information System and Biostatistics Specialist

Abstract

Maternal mortality refers to deaths associated with pregnancy. It is a crucial social determinant of health and a key sociodemographic indicator for measuring and evaluating the quality of health care services (particularly antenatal care services), and it reflects the strength of the health system in general. Sudan was among the first countries in the Arab and African region to conduct relevant surveys (the Demographic and Health Survey 1989, the Safe Motherhood Survey 1990, the Sudan Household Health Surveys 2006 and 2010, and the Multiple Indicator Cluster Survey 2014). However, the last survey to include the maternal mortality rate was conducted in 2010, when the rate was 216 per 100,000 live births. Owing to the unstable situation (the 2018 Sudanese Revolution and the subsequent political situation), the sixth Multiple Indicator Cluster Survey (MICS 6), planned for 2018, was not conducted.

This paper estimates the maternal mortality rate (MMR) in Sudan by cause of death, place of death, obstetric characteristics and territorial disparity. The data were collected from the Federal Ministry of Health (Annual Statistical Report and Maternal Deaths Surveillance) for the year 2020.

The maternal mortality rate for the country was 278.7 per 100,000 live births in 2020, with the highest rate recorded in East Darfur state (1,531.8 per 100,000 live births). Most maternal deaths were due to obstetric haemorrhage (35%), and 45% of maternal deaths occurred among women aged 20-30 years; 508 deaths (62%) occurred outside antenatal care (ANC) and follow-up services. West Kordofan state registered the most maternal deaths (10% of deaths across all states), and most maternal deaths occurred in health facilities (82% by place of death), more than at home or on the road. The maternal mortality rate in 2020 was therefore higher than the rate of 216 per 100,000 live births reported in the last SHHS survey in 2010.



posters-tuesday-BioZ: 22

Community-Based Health Screening Attendance and All-Cause Mortality in Rural South Africa: A Causal Analysis

Faith Magut1, Stephen Olivier1, Ariane Sessego1, Lusanda Mazibuko1, Jacob Busang1, Dickman Gareta1,6, Kobus Herbst1,5, Kathy Baisely3,1, Mark Siedner1,2,4

1Africa Health Research Institute (AHRI), South Africa; 2Massachusetts General Hospital, Boston, Massachusetts, United States of America; 3London School of Hygiene & Tropical Medicine, Keppel Street, London, UK; 4University of KwaZulu-Natal, Durban, South Africa; 5DSI-SAMRC South African Population Research Infrastructure Network, Durban, South Africa; 6Institute of Social and Preventive Medicine, University of Bern, Bern, Switzerland

Background

South Africa is moving from a period marked by high mortality from HIV and tuberculosis(TB) to one characterised by a growing burden of non-communicable diseases. Community health fairs help to diagnose and refer individuals with chronic diseases in underserved areas. However, their impact on morbidity and all-cause mortality is unknown.

Methods

We enrolled individuals aged 15 years and older in the Africa Health Research Institute Health and Demographic Surveillance area in rural KwaZulu-Natal in a community-based health fair screening and referral program (Vukuzazi). Testing was performed for HIV, TB, hypertension and diabetes. Those with positive results were visited at home for provision of results and referral to local clinics.

All individuals in the area were followed longitudinally through routine household surveillance to detect deaths. We used directed acyclic graphs to identify the following confounders of the association between health-fair attendance and mortality: age, sex, educational attainment, employment, household socio-economic status and prior healthcare-seeking behavior.

To estimate the effect of Vukuzazi health fair attendance on all-cause mortality, we first estimated inverse probability of treatment weights (IPTW) for health fair attendance, then applied weighted Kaplan-Meier analysis to compare survival and weighted Cox regression to estimate hazard ratios and marginal risk differences. We conducted a sensitivity analysis in which we excluded deaths due to external causes (e.g. injuries) that would not be expected to be prevented by health fair attendance.
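A minimal sketch of this weighting-plus-weighted-survival analysis in R (illustrative only; variable names such as attend, death, time and the confounder columns are placeholders, and weight stabilisation/truncation is omitted):

library(survival)

# dat: one row per individual with attend (0/1), death (0/1), time (years) and the confounders above
ps <- glm(attend ~ age + sex + education + employment + ses + prior_care,
          family = binomial, data = dat)$fitted.values
dat$w <- ifelse(dat$attend == 1, 1 / ps, 1 / (1 - ps))   # IPTW for attendance

km_w  <- survfit(Surv(time, death) ~ attend, data = dat, weights = w)   # weighted Kaplan-Meier
cox_w <- coxph(Surv(time, death) ~ attend, data = dat, weights = w,
               robust = TRUE)                                           # weighted Cox, sandwich variance
summary(cox_w)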

Results

A total of 18,041 individuals (50.0% of those eligible) attended Vukuzazi health fairs. Compared to non-attenders, attenders were more likely to be women (68% vs 49%), older (median 37 vs 31 years), unemployed (37% vs 20%) and more likely to have accessed health care in the past year (53% vs 33%). Individuals were observed after health fairs for a median of 4.0 years (IQR 3.7 - 4.2 years), comprising a total of 127,625 person-years. The crude mortality rate was 12.14 (95% CI: 11.54-12.76) per 1000 person-years. In weighted Kaplan-Meier analysis, attenders had better survival compared to non-attenders. In the IPTW-adjusted models, Vukuzazi health fair attendance was associated with a 25% reduction in the hazard of all-cause mortality (HR=0.75, 95%CI: 0.67, 0.84), corresponding to a 1.5% absolute reduction in mortality over five years. Findings were similar in the sensitivity analysis.

Discussion

Participation in a community-based health fair was associated with a reduction in 5-year all-cause mortality. The integration of health fairs with referral practices into standard healthcare delivery within rural areas may be an effective strategy to improve health outcomes.



posters-tuesday-BioZ: 23

Reducing Uncertainty in Fertility Meta-Analysis: A Multivariate Approach to Clinical Pregnancy and Live Birth Outcomes

Mahru Ahmad, Jack Wilkinson, Andy Vail

University of Manchester, United Kingdom

Background:

Meta-analyses of assisted reproductive technology (ART) trials commonly assess clinical pregnancy and live birth as separate outcomes, despite their hierarchical dependency. Many trials report pregnancy but not live birth, limiting the applicability of univariate meta-analyses for live birth outcomes. This can lead to imprecise estimates and uncertainty about intervention effectiveness. Multivariate meta-analysis (MVMA) offers a potential solution by jointly modelling related outcomes, maximizing the use of available data and improving statistical precision.

Objectives:
This study aims to investigate whether MVMA provides a more reliable estimation of live birth outcomes compared to traditional univariate meta-analysis. Specifically, we:

  1. Construct an MVMA model incorporating both clinical pregnancy and live birth outcomes using data from systematic reviews of ART trials (2020–2021).
  2. Compare MVMA with univariate approaches, evaluating the extent to which MVMA improves precision and whether this would lead to different inferences.
  3. Explore different correlation structures between clinical pregnancy and live birth, assessing their impact on effect estimates.

Methods:
Data will be extracted from Cochrane systematic reviews (2020–2021), including trial-level counts of clinical pregnancies and live births for treatment and control groups. MVMA models will be implemented under various correlation assumptions, including the Wei and Higgins method to account for the relationship between outcomes. The study will assess the performance of MVMA versus univariate meta-analysis by comparing uncertainty in effect estimates and methodological implications.
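As a sketch of what such a bivariate model could look like in R with metafor (one of several possible implementations, and not the Wei and Higgins approach itself), assuming a long-format data frame d with one row per trial and outcome, yi the log odds ratio, vi its variance, outcome coded "preg"/"lb" (live birth missing for some trials), and an assumed within-trial correlation rho:

library(metafor)

d   <- d[order(d$trial, d$outcome), ]        # rows must match the block-diagonal V below
rho <- 0.8                                   # assumed within-trial correlation between outcomes
V <- lapply(split(d, d$trial), function(x) {
  S <- diag(x$vi, nrow = nrow(x))
  if (nrow(x) == 2) S[1, 2] <- S[2, 1] <- rho * sqrt(prod(x$vi))
  S
})
V <- as.matrix(Matrix::bdiag(V))

mv <- rma.mv(yi, V, mods = ~ outcome - 1,    # one pooled effect per outcome
             random = ~ outcome | trial, struct = "UN", data = d)
summary(mv)

Varying rho (and the between-trial correlation structure) then mirrors the sensitivity analyses around correlation assumptions described above.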

Results:
This study will assess whether MVMA can enhance the precision of live birth effect estimates by making better use of incomplete ART trial data, thereby reducing the considerable uncertainty surrounding many fertility interventions. The findings, which will be available at the time of the presentation, will help determine the extent to which MVMA can enhance statistical power when live birth data are incomplete. This work will contribute to methodological advancements in fertility research by optimising the use of available trial data and improving the reliability of conclusions drawn from ART studies.



posters-tuesday-BioZ: 24

Causal discovery for multi-cohort studies

Christine Bang1, Vanessa Didelez2,3

1University of Copenhagen; 2Leibniz Institute for Prevention Research and Epidemiology - BIPS; 3University of Bremen

Causal discovery methods aim to learn causal structures in a data-driven way. The availability of multiple overlapping cohort datasets enables us to learn causal pathways over an entire lifespan. Evidence of such pathways may be highly valuable, e.g. in life course epidemiology. No previous causal discovery methods tailored to this framework exist. We show how to adapt an existing causal discovery algorithm for overlapping datasets to account for the time structure embedded in cohort data. In particular, we show that this strengthens the method in multiple aspects.
We consider causal discovery methods that recover causal structures from (conditional) independencies in a given set of variables. Multiple causal structures may induce the same dependence structure and form an equivalence class. Without additional, stronger assumptions, it is usually not possible to recover more than the equivalence class; i.e. we cannot identify all causal directions. Moreover, when combining multiple datasets, if some variables are never measured jointly their (conditional in-)dependence is by construction unknown. Then, we cannot even identify the equivalence class. Hence, constraint-based causal discovery for multiple datasets suffers from two types of obstacles for identification.
Time-structured data induce a partial causal ordering of the variables, which we refer to as tiered background knowledge. It is easy to see that tiered background knowledge improves the identifiability of causal directions. Additionally, we show that tiered background knowledge also improves the (partial) identifiability of the equivalence class, which is not trivial. We provide theoretical results on the informativeness of the algorithm as well as theoretical guarantees. Finally, we provide detailed examples that illustrate how the algorithm proceeds, as well as examples of cases where tiered background knowledge increases the level of informativeness.



posters-tuesday-BioZ: 25

Extension of Causal Interaction Estimation Techniques through Integration of Machine Learning Algorithms

A F M Tahsin Shahriar, AHM Mahbub-ul Latif

University of Dhaka, People's Republic of Bangladesh

This study explores the challenges of causal interaction analysis, particularly in public health and policy evaluation, where understanding how multiple exposures influence outcomes is crucial. Identifying these interactions is complex due to unobserved confounding, measurement errors, and high-dimensional datasets. Traditional econometric methods, while widely used, often rely on strong assumptions that may not hold in complex real-world scenarios.

This study reviews established causal inference methods, including Difference-in-Differences (DiD), Changes-in-Changes (CiC), and matching. These methods have limitations, particularly in handling high-dimensional data and complex interactions. To address these challenges, this research investigates an alternative approach using machine learning models, specifically Causal Forests and Bayesian Additive Regression Trees (BART), to estimate causal interactions. These models are used to obtain Conditional Average Treatment Effect (CATE) estimates, which are then used to compute the Average Treatment Effect on the Treated (ATET). However, these methods did not consistently outperform traditional methods in simulations, especially with smaller samples.

A key contribution of this study is the development of causal mixture methods, which integrate the adaptability of machine learning algorithms, like Gradient Boosting Machines (GBM) and Random Forests (RF), for first-stage estimation with the interpretability and robustness of traditional econometric frameworks, such as Difference-in-Differences (DiD), to enhance resilience to unmeasured confounding and measurement errors. This approach involves first estimating propensity scores using machine learning methods to capture complex relationships between covariates and treatment assignment. These estimated propensity scores are then integrated into the standard DiD model to improve covariate balance and comparability between treated and control groups, mitigating selection bias and enhancing the robustness of causal estimates. This approach aligns with modern econometric frameworks like Double Machine Learning (DML).
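A simplified two-period sketch of this mixture idea in R (hypothetical variable names; a random forest stands in for the first-stage learner, and the trimming rules, standard errors and richer specifications used in practice are omitted):

library(randomForest)

# wide: one row per unit with treat (0/1), covariates x1-x3 and outcomes y0 (pre) and y1 (post)
rf <- randomForest(factor(treat) ~ x1 + x2 + x3, data = wide, ntree = 500)
ps <- predict(rf, type = "prob")[, 2]
ps <- pmin(pmax(ps, 0.01), 0.99)                          # clip extreme propensity scores

wide$w <- ifelse(wide$treat == 1, 1, ps / (1 - ps))       # ATT-type weights for the controls

long <- reshape(wide, varying = c("y0", "y1"), v.names = "y",
                timevar = "post", times = c(0, 1), direction = "long")
did <- lm(y ~ treat * post, data = long, weights = w)
summary(did)                                              # treat:post coefficient ~ ATET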

Simulation studies were conducted to assess the performance of various causal inference methods. Data were generated with varying levels of noise to examine the impact of measurement error. The mixture methods, integrating ML-based propensity scores with DiD regression, produced unbiased estimates, demonstrating robustness to measurement error.

In summary, this study advances the field of causal inference by: (i) presenting a detailed comparative analysis of econometric and machine learning-based methods, (ii) proposing causal mixture models that integrate machine learning for robust first-stage estimation, and (iii) comparing bias through simulations. These contributions provide researchers with practical tools and a stronger theoretical foundation for addressing challenges in causal interaction analysis, particularly in high-dimensional and complex settings, ensuring more reliable and interpretable conclusions for decision-making in public health and policy research.



posters-tuesday-BioZ: 26

Embrace Variety, Find Balance: Integrating Clinical Trial and External Data Using Causal Inference Methods

Rima Izem1, Yuan Tian2, Robin Dunn3, Weihua Cao3

1Novartis Pharma AG, Switzerland; 2China Novartis Institutes for BioMedical Research Co., Ltd.; 3Novartis Pharmaceuticals Corporation, USA

Integrating information from multiple sources is important for multiple stakeholders in the development of pharmaceutical products. For example, augmenting the control arm of a randomized controlled trial with external data from previously conducted trials can inform internal decision-making in early development or expedite development in small populations with unmet medical need. Also, leveraging external controls from a disease registry to a single arm trial can make it possible to estimate the comparative treatment effect of the study drug when a randomized comparison is unfeasible or unethical. The main challenge in this data integration is assessing potential biases, due to between-source differences, and minimizing or mitigating these biases in the integrated design and analysis.

This presentation proposes the use of a workflow implementing propensity score methods, developed in observational data, when estimating treatment effects from multiple data sources with individual-level data. First, causal inference thinking can help identify the causal estimand, establish the underlying assumptions, and focus the assessment of between-source heterogeneity on key variables. The use of target trial emulation and balance diagnostics can identify the relevant subset in the external data, assess the extent of adjustment needed, evaluate the plausibility of important assumptions, such as positivity, and assess adequacy of propensity score adjustment. Lastly, for fit-for-purpose external data, a variety of methods can leverage the propensity score to estimate the treatment effect. Our presentation will share practical considerations at each step of the workflow and illustrate its use with case studies and simulated data from pharmaceutical development.



posters-tuesday-BioZ: 27

Revisiting subgroup analysis: A reflection on health disparities using conditional independence

Nia Kang, Tibor Schuster

McGill University, Canada

Introduction: Comparative assessment is deeply ingrained in human nature to answer cause and effect questions. It is also an important feature of methodological rigour, underlying many research designs including randomized controlled trials, epidemiological studies and population-level evaluations for informing health policy. Programs that aim at addressing health disparities often rely on comparisons of health indicators across predefined sub-populations (i.e., groups distinguished by fixed socio-demographic characteristics), rather than by theoretically assignable exposures or interventions.

Although tailoring health policy implications to such subgroups may seem reasonable, this approach risks oversimplification, as the intersectional nature of socio-demographic factors can obscure those with the greatest need, rendering population-level interventions derived from such analyses less effective.

Methods: Using principles from probability theory, we define health parity as the stochastic independence between one or more health indicators and any subdivision of the population conditional on confounding factors. We consider the presence of two or more group-defining features that may intersect within and across subpopulations. We further assume the availability of a program or policy P that has a positive causal impact on the health indicator(s) under study but has limited resource allocation.

Using Bayes’ theorem, we derived a target function that factorizes the tradeoff between decreasing subgroup-specific health disparities and lowering the marginal prevalence of a poor health outcome, given practical constraints such as resource availability. We conducted extensive Monte Carlo simulation studies to demonstrate how the proposed function can help identify the optimal P in terms of maximizing health parity. Factors considered in the simulations are the degree of impact of P, resource availability, the number and prevalence of population subgroups, and varying distributions of health outcomes.

Results/Conclusion: The proposed functional approach demonstrated utility in assessing the effectiveness of health programs and policies aimed at maximizing health parity. Although subpopulations defined based on sociodemographic features provide an easy ground for conventional comparative assessment, they may have limited capacity to inform the most effective health policies. Indeed, our findings imply that comparative subgroup analysis should be supplemented with marginal outcome distributions by leveraging the proposed target function approach.



posters-tuesday-BioZ: 28

Comparison of Multiple Imputation Approaches for Skewed Outcomes in Randomised Trials: a Simulation Study

Jingya Zhao, Gareth Ambler, Baptiste Leurent

University College London, United Kingdom

Introduction

Missing outcome data is a common issue in trials, leading to information loss and potential bias. Multiple imputation (MI) is commonly used to impute missing data; one advantage is that it can include additional predictors of 'missingness' that are not in the analysis model. However, standard MI methods assume normality for continuous variables, which is often violated in practice, e.g. healthcare costs are typically highly skewed. Alternative MI approaches, involving Predictive Mean Matching (PMM) or log transformations, have been proposed for handling skewed variables. Using simulation, we compare different methods for imputing missing values of skewed outcome variables in randomised trials.

Methods

We simulated trial data with two treatment arms and correlated skewed baseline and follow-up variables. We considered three different missing data mechanisms for the follow-up variable: missing completely at random (MCAR), missingness associated with treatment arm (MAR-T), and missingness associated with baseline (MAR-B). We compared seven methods: Complete Case Analysis (CCA), Multivariate Normal Imputation (MVN), Multiple Imputation by Chained Equations (MICE), and Predictive Mean Matching (PMM), along with log-transformed versions (LogMVN, LogMICE, and LogPMM) which perform imputation on the log-transformed variables. Assessment of performance focused on bias and confidence interval (CI) coverage when estimating the mean difference between arms. These methods were also applied to the analysis of a healthcare costs trial dataset.
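Two of the seven strategies (PMM, and normal imputation after a log transformation) could be sketched in R with mice roughly as follows; the simulated data and parameter values are purely illustrative and not those of the study:

library(mice)
set.seed(1)
n    <- 300
arm  <- rbinom(n, 1, 0.5)
base <- rlnorm(n, 6, 0.8)                                    # skewed baseline costs
y    <- rlnorm(n, 6 + 0.1 * arm + 0.4 * scale(log(base)), 0.9)
y[rbinom(n, 1, plogis(-1.2 + 0.6 * arm)) == 1] <- NA         # MAR-T style missingness
dat  <- data.frame(arm, base, y)

# PMM on the raw scale
imp_pmm <- mice(dat, method = c("", "", "pmm"), m = 20, printFlag = FALSE)
summary(pool(with(imp_pmm, lm(y ~ arm + base))))

# LogMICE-type strategy: impute log(y), back-transform, then analyse on the original scale
dat_log <- data.frame(arm, base, ylog = log(y))
imp_log <- mice(dat_log, method = c("", "", "norm"), m = 20, printFlag = FALSE)
fits <- lapply(seq_len(imp_log$m), function(i) {
  d <- complete(imp_log, i); d$y <- exp(d$ylog)
  lm(y ~ arm + base, data = d)
})
summary(pool(as.mira(fits)))                                 # Rubin's rules across imputations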

Results

The simulation results showed that LogMVN and LogMICE typically outperformed the other methods. MVN and MICE also performed well under MCAR and MAR-T but poorly under MAR-B. PMM and LogPMM generally performed poorly, often showing under-coverage. CCA performed well under MCAR but not under the MAR mechanisms. When applied to the trial dataset, PMM and LogPMM produced point estimates similar to that of CCA, with the narrowest CIs. Conversely, LogMVN and LogMICE yielded higher point estimates, along with the widest CIs. Additional simulations are being performed to explore further results under different outcome distributions, missing data mechanisms and sample sizes.

Conclusion

Our results suggest that a log transformation before MI might be a useful strategy for handling skewed variables (although non-positive values need careful handling). The performance of MVN and MICE depends on the specific missingness mechanism, and the PMM method cannot be recommended. However, further evaluation under alternative data-generating mechanisms is needed.



posters-tuesday-BioZ: 29

Assessing the effect of drug adherence on longitudinal clinical outcomes: A comparison of Instrumental Variable and Inverse Probability Weighting methods.

Xiaoran Liang1, Deniz Türkmen1, Jane A H Masoli1,2, Luke C Pilling1, Jack Bowden1,3

1University of Exeter, United Kingdom; 2Royal Devon University Healthcare NHS Foundation Trust, Exeter, United Kingdom; 3Novo Nordisk Research Centre (NNRCO), Oxford, United Kingdom

Background: Drug adherence refers to the degree to which patients comply with prescribed therapeutic regimens when taking medications, and high adherence is essential for ensuring the expected efficacy of pharmacological treatments. However, in routine care settings, low adherence is a major obstacle. For instance, real-world studies report that adherence to commonly prescribed statin therapy can drop below 50% within the first year of treatment, substantially lower than observed in the controlled trials that led to their original approval.

Method: In this paper we discuss the use of longitudinal causal modelling to estimate the time-varying causal effects of adherence on patients’ health outcomes over a sustained period. The goal of such analyses is to quantify the impact that interventions to improve adherence could have on long-term health. If a meaningfully large difference is estimated, the natural focus can then shift to deciding how to realize such an intervention in a cost-effective manner. Two estimation approaches, Inverse Probability Weighting (IPW) and Instrumental Variables (IV), have been proposed in the ‘Estimand framework’ literature to adjust for non-adherence in randomized clinical trials, where non-adherence is viewed as an intercurrent event. We refine and adapt these methods to assess long-term adherence in the observational data setting, which differs from a clinical trial in several key respects: first, there is no overt randomization to treatment; second, adherence and longitudinal outcomes are only available in those who are treated. We clarify the assumptions each method makes and assess the statistical properties of each approach using Monte Carlo simulation as well as real data examples on statin use for LDL cholesterol control and metformin use for HbA1c control, taken from primary care data in UK Biobank.
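A single-time-point simplification of the two estimators in R (illustrative only; the longitudinal, time-varying versions studied in the paper are more involved, and the variable names ldl, adherent, z, x1, x2 are hypothetical, with z standing in for a valid instrument):

library(AER)   # ivreg() for two-stage least squares

# d: treated patients only, with outcome ldl, adherence indicator adherent (0/1),
# an instrument z and measured confounders x1, x2

iv_fit <- ivreg(ldl ~ adherent + x1 + x2 | z + x1 + x2, data = d)   # IV: robust to unmeasured confounding
summary(iv_fit)

p_adh <- glm(adherent ~ x1 + x2, family = binomial, data = d)$fitted.values
d$w   <- ifelse(d$adherent == 1, 1 / p_adh, 1 / (1 - p_adh))        # IPW: measured confounders only
summary(lm(ldl ~ adherent, data = d, weights = w))                  # use robust or bootstrap SEs in practice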

Results: The findings from our simulations align with theoretical expectations. The IV method effectively accounts for time-varying observed and unobserved confounders but relies on strong, valid instruments and additional parametric assumptions on the causal effects. In contrast, the IPW method addresses observed confounders without requiring additional assumptions but remains susceptible to bias from unmeasured confounding.



posters-tuesday-BioZ: 30

Compliance between different anthropometric indexes reflecting nutritional status in women with polycystic ovary syndrome

Aleksander J. Owczarek1, Marta Kochanowicz2, Paweł Madej2, Magdalena Olszanecka-Glinianowicz1

1Health Promotion and Obesity Management Unit, Department of Pathophysiology, Faculty of Medical Sciences in Katowice, Medical University of Silesia in Katowice, Poland; 2Department of Gynecological Endocrinology, Faculty of Medical Sciences in Katowice, Medical University of Silesia in Katowice, Poland

Background: Obesity (mainly diagnosed based on the body mass index – BMI) is the main risk factor for developing polycystic ovary syndrome (PCOS). Based on BMI not all women with PCOS are diagnosed with obesity. However, BMI does not assess visceral fat deposits that play a key role in the pathogenesis of PCOS. Thus, there is a constant search for anthropometric indicators that allow the assessment of visceral fat deposits. This study aimed to assess the comparison of various anthropometric indicators for the diagnosis of excessive fat deposits.

Methods: Based on body mass, height, waist and hip circumference, eleven indexes were calculated: BMI, waist-to-hip ratio (WHR), waist-to-height ratio (WHtR), waist-to-hip-to-height ratio (WHHR), body adiposity index (BAI), a body shape index (ABSI), body roundness index (BRI), weight-adjusted waist index (WWI), abdominal volume index (AVI), conicity index (CI), and the Rohrer (corpulence) index (RI). To compare the indexes with each other using Passing-Bablok (PB) regression, they were scaled to the range [0,1]. The serum lipid profile (total cholesterol, LDL and HDL cholesterol, triglycerides) as well as the triglyceride-glucose index (TyG) were also determined.
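For a single pair of indexes, the rescaling and Passing-Bablok fit could be sketched in R with the mcr package roughly as follows (illustrative variable names; equivalence of two indexes is suggested when the intercept CI covers 0 and the slope CI covers 1):

library(mcr)

rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))
x <- rescale01(dat$bmi)                       # e.g. BMI
y <- rescale01(dat$whtr)                      # e.g. WHtR

pb <- mcreg(x, y, method.reg = "PaBa", method.ci = "bootstrap")
getCoefficients(pb)                           # intercept and slope with confidence limits
plot(pb)                                      # scatter plot with the Passing-Bablok line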

Results: The study group comprised 611 women with diagnosed PCOS, with a mean age of 26.3 ± 4.8 (range: 17 – 43) years. There were significant positive linear correlations between indexes (ranging from 0.08 to 0.99), except between ABSI and BAI and between ABSI and RI. Overall, 55 comparisons between indexes were made with PB regression with regard to intercept and slope. Apart from the comparisons BMI vs RI, WHHR vs WWI, and WHR vs ABSI, all methods differed from each other with regard to the intercept (ranging from -0.28 to 0.24). With regard to the slope, 21 (38.2%) comparisons yielded slopes that did not differ significantly from 1. The highest slope value of 1.42 (95% CI: 1.27 – 1.57) was noted for the comparison of WWI vs BAI, and the lowest value of 0.72 (95% CI: 0.68 – 0.77) for BAI vs WHtR. The indexes most consistent with each other were WHR vs WHtR and ABSI, BRI vs RI, and BMI vs BRI and RI. The highest significant correlations with the lipid profile were observed for WHtR and the lowest for ABSI.

Conclusions: Individual anthropometric indexes are not equivalent to each other. Assessment of the level of nutrition using different indicators may lead to over- or underdiagnosis of obesity among women with PCOS.



posters-tuesday-BioZ: 31

Effectiveness of different macronutrient composition diets on weight loss and blood pressure. A network meta-analysis

Katerina Nikitara1, Anna-Bettina Haidich2, Meropi Kontogianni3, Vasiliki Bountziouka1

1Computer Simulation, Genomics and Data Analysis Laboratory, Department of Food Science and Nutrition, School of the Environment, University of the Aegean; 2Department of Hygiene, Social-Preventive Medicine and Medical Statistics, School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki; 3Department of Nutrition and Dietetics, School of Health Sciences and Education, Harokopio University

Background: The scientific evidence surrounding the effectiveness of macronutrient composition on weight loss and reduction of Systolic Blood Pressure (SBP) and Diastolic Blood Pressure (DBP) is conflicting. Advanced analytical methods can be used to examine the effects of different macronutrient compositions. This study explored the effectiveness of diets with different macronutrient compositions for weight loss and blood pressure through network meta-analysis (NWMA).

Methods: A systematic review was conducted by retrieving studies from five bibliographic databases (January 2013 to May 31, 2023). The study population included adults at high risk for cardiovascular diseases, while the outcomes assessed involved markers of glycemic control, obesity, dyslipidemia, and inflammation. Specifically, in the present study, the outcomes of interest were the mean difference in Body Mass Index (BMI), Waist Circumference (WC), SBP and DBP before and after the intervention. The reference diet used for BMI and WC was low-fat (<30%), moderate-carbohydrate (45-60%), and high-protein (19-40%) (LFMCHP) and for the SBP and DBP, low-fat, moderate-carbohydrate, and moderate-protein (10-18%) (LFMCMP), according to the reference diets used in the studies included for each outcome.
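A rough sketch of the corresponding analysis in R with the netmeta package (assuming an arm-level data frame arms with illustrative column names; the actual data extraction and model details may differ):

library(netmeta)

# arms: one row per study arm with study, diet (intervention label), n,
# and the mean change in BMI (bmi_change) with its SD (bmi_sd)
pw <- pairwise(treat = diet, n = n, mean = bmi_change, sd = bmi_sd,
               studlab = study, data = arms, sm = "MD")

nm <- netmeta(TE, seTE, treat1, treat2, studlab, data = pw,
              sm = "MD", random = TRUE, reference.group = "LFMCHP")
summary(nm)
netrank(nm)   # P-scores; set small.values so that larger BMI reductions rank as better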

Results: Ten studies (n=1,008 individuals) were included in the NWMA for BMI, six (n=835) for WC, and seven (n=1,103) for SBP and DBP. A random-effects model was used in the NWMA. Results revealed that, compared to the reference diet (LFMCHP), only the high-fat (36-60%), low-carbohydrate (26-44%), high-protein (HFLCHP) diet demonstrated a greater reduction in BMI after the intervention, by 0.32 kg/m² (95% CI: -0.34 to -0.30; I²=0%; p<0.001). Additionally, the highest ranking in terms of certainty of effectiveness was observed for the high-fat, very low-carbohydrate (<26%), very high-protein (>40%) diet (HFVLCVHP) (P-score: 0.71) compared to the other interventions, followed by the HFLCHP diet (P-score: 0.63) and the high-fat, low-carbohydrate, moderate-protein diet (HFLCMP) (P-score: 0.59). Non-significant results were found for WC, SBP, and DBP.

Conclusion: This NWMA suggests that high-fat, low-carbohydrate, high-protein diets may be more effective for BMI reduction, while no significant effects were observed for blood pressure. These findings highlight the potential role of macronutrient composition in weight management but indicate the need for further research to clarify its impact on other cardiometabolic outcomes.



posters-tuesday-BioZ: 32

Going from methodological research to methods guidance: the STandards for the development REseArch Methods guidance (STREAM) initiative

Malena Chiaborelli, Julian Hirt, Matthias Briel, Stefan Schandelmaier

University Hospital Basel, Switzerland

Background: Health researchers need clear and trustworthy methods guidance (e.g. tutorials on handling baseline missing data in trials; best practice regarding calibration of prediction models) to help them plan, conduct, and analyse their studies. Methodological research (based on logic, simulation, or empirical studies) can sensibly inform methods guidance. How to go from methodological research to methods guidance, however, is currently unclear. A new initiative (Standards for the Development of Research Methods Guidance, STREAM) aims to develop a structured process to connect methodological research with methods guidance.

Methods: STREAM includes a series of studies: 1) a scoping review of existing standards to develop methods guidance, 2) a meta-study to assess the current practice of methods guidance development, 3) an interview study to understand the needs of health researchers who use methods guidance, 4) a consensus study to develop standards for methods guidance development, and 5) user testing of these standards in ongoing guidance development projects.

Results: At the conference, we will present the overall initiative and results of the first two studies. The scoping review identified 6 articles addressing the development of methods guidance. Of those, 1 mentioned methodological research (specifically: empirical studies) as an input for guidance development, without specifying a process. None of the included articles mentioned simulation studies as an input. For the meta-study, we reviewed 1202 methods guidance articles, most published after 2018. Of those, 347 reported a development process: 156 (45%) performed a systematic review of the methodological literature, 93 (27%) a consensus process, 71 (20%) user-testing, 43 (12%) empirical studies, and 36 (10%) simulation studies.

Impact: The two initial studies of the STREAM initiative reveal that the literature addressing the development of methods guidance is scarce and limited and that methods guidance articles rarely report a development process. Guidance developers use varying ad hoc approaches to create guidance and rarely seek input from their users (health researchers). The findings suggest that current methods guidance could be improved to make it more helpful for health researchers and better support the production of high-quality evidence. The new standards for the development of research methods guidance will provide explicit solutions to these challenges.



posters-tuesday-BioZ: 33

Effectiveness of a Skill Check Sheet for Registered Dietitians: A Cluster Randomized Controlled Trial Protocol

Misa Adachi1,2, Asuka Suzuki2, Kazue Yamaoka1,3, Mariko Watanabe4, Toshiro Tango1,5

1Nutrition Support Network LLC, Sagamihara, Japan; 2Teikyo University Graduate School of Public Health, Japan; 3Tetsuyu Clinical Research Center, Tetsuyu Institute Medical Corporation, Tokyo, Japan; 4Showa Women’s University, Tokyo, Japan; 5Center for Medical Statistics, Tokyo, Japan

Introduction:

Registered dietitians (RDs) play a critical role in promoting lifestyle improvements through evidence-based nutrition interventions. To enhance RD competencies in nutrition education, we developed a Skill Check Sheet (SCS) designed to support self-assessment and skill improvement. A preliminary single-group intervention study (3 months) suggested that SCS might effectively improve RD skills. This study aims to evaluate its effectiveness in reducing glycated hemoglobin (HbA1c) levels among patients with type 2 diabetes (T2D) by conducting a cluster randomized controlled trial (cRCT). The intervention compares a validated nutrition education program, the SILE program (Adachi et al., 2017), with an enhanced version incorporating the SCS (SILE+SCS).

Methods and Results:

This 4-month cRCT will randomly assign RDs to one of two intervention arms (SILE+SCS vs. SILE). Each RD will manage seven T2D patients aged 20–80 years. The primary outcome is the change in HbA1c from baseline. The intervention effect will be assessed using an intention-to-treat (ITT) analysis with a generalized linear mixed-effects model, adjusting for covariates.

The sample size calculation was based on previous studies and preliminary data, assuming a standardized mean difference (SMD) of 0.33, an intraclass correlation coefficient (ICC) of 0.01, a two-sided significance level of 5%, and 80% power, with seven patients per RD cluster. This resulted in a required sample of 21 RDs per group. Accounting for a 10% dropout rate, the final target is 23 RDs per group, totaling 322 patients.

Conclusion:

Preliminary findings suggest that SCS may enhance RD skills in nutrition education. This cRCT will rigorously evaluate its effectiveness, ultimately aiming to contribute to the prevention and management of lifestyle-related diseases.

Reference

Adachi M, Yamaoka K, Watanabe M, et al. Does the behavioural type-specific approach for type 2 diabetes promote changes in lifestyle? Protocol of a cluster randomised trial in Japan. BMJ Open 2017;7:e017838. doi:10.1136/bmjopen-2017-017838



posters-tuesday-BioZ: 34

Bootstrap-based approaches for inference on the total deviation index in agreement studies with replicates

Anna Felip-Badia1, Josep L Carrasco2, Sara Perez-Jaume1,2

1BiMaU, Sant Joan de Déu Pediatric Cancer Center Barcelona, Spain; 2Department of Basic Clinical Practice, Universitat de Barcelona, Spain

Introduction

The total deviation index (TDI) is an unscaled statistical measure used to evaluate the deviation between paired quantitative measurements when assessing the extent of agreement between different raters. It describes a boundary such that a large specified proportion of the differences in paired measurements lie within the boundary (Lin, 2000). Inference on the TDI involves the estimation of a 100(1-α)% upper bound (UB), where α is the significance level. Several methods to estimate the TDI and the UB have been proposed (Choudhary, 2008, 2010; Escaramis, 2010). In 2015, Perez-Jaume and Carrasco (P-J&C) proposed a non-parametric method that estimates the TDI as a quantile of the absolute value of the within-subject differences between raters and bootstraps them with two strategies to estimate the UB. Our goal is to assess an alternative bootstrap approach for estimating the UB with P-J&C’s method, and to compare its performance, as well as that of the TDI estimates, to the existing methods in the literature.

Methods
We consider two non-parametric bootstrap approaches for studies with replicates: the bootstrap of the within-subject differences and an alternative approach of a cluster bootstrap at subject level. We also consider four strategies to estimate the UB: the ones based on the basic percentile and the normal distribution from P-J&C and two additional ones based on empirical quantiles and BCa confidence limits. This leads to eight different ways of UB estimation. We implement all the above-mentioned methods to estimate the TDI and the bootstrap-based approaches for inference in an R package and conduct a simulation study to compare the performance of all the methodologies considered in this work. Furthermore, we apply them to a real case dataset.
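The non-parametric TDI estimate and the cluster-bootstrap UB could be sketched in R as follows (base R only; d is a hypothetical long-format data frame with a subject identifier id and the within-subject between-rater differences diff, one per replicate; only the empirical-quantile and normal-approximation UBs are shown):

p <- 0.90; alpha <- 0.05; B <- 2000
tdi_hat <- quantile(abs(d$diff), probs = p, names = FALSE)

set.seed(123)
ids  <- unique(d$id)
boot <- replicate(B, {
  take  <- sample(ids, length(ids), replace = TRUE)              # cluster bootstrap at subject level
  diffs <- unlist(lapply(take, function(i) d$diff[d$id == i]))
  quantile(abs(diffs), probs = p, names = FALSE)
})

ub_quantile <- quantile(boot, probs = 1 - alpha, names = FALSE)  # empirical-quantile UB
ub_normal   <- tdi_hat + qnorm(1 - alpha) * sd(boot)             # normal-approximation UB
c(TDI = tdi_hat, UB_quantile = ub_quantile, UB_normal = ub_normal)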

Results
All the methods exhibit a tendency to overestimate the TDI, except for Choudhary's 2010 method, which seems to underestimate it in all combinations considered in the simulation study. The bias and the mean squared error are reduced as the sample size increases for all methods, indicating consistent asymptotic properties. Regarding the empirical coverages, the cluster bootstrap approach gives values closer to the expected 95% than the bootstrap of the within-subject differences. Finally, for the real dataset with replicates, all techniques provided similar estimates, with the BCa strategy resulting in slightly higher UBs in most cases.

Conclusion
In studies with replicates, when applying bootstrapping to estimate the UB using the P-J&C estimator, the cluster bootstrap approach is recommended.



posters-tuesday-BioZ: 35

Baseline treatment group adjustment in the BEST study, a longitudinal randomised controlled trial.

Robin Young1, Alex McConnachie1, Helen Minnis2

1Robertson Centre for Biostatistics, University of Glasgow, United Kingdom; 2Centre for Developmental Adversity and Resilience (CeDAR), University of Glasgow, United Kingdom

In an RCT with measurements of the outcome variable at baseline and one or more follow-up visits, a linear mixed effects regression model can be used. Due to randomisation it would be expected that there is no difference between treatment groups at baseline, and so a model term for the treatment effect at baseline can be omitted. It has been shown that such a “constrained baseline analysis” would have more power than if a term for the baseline treatment effect is included in the analysis model [1].
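In R, the constrained and unconstrained analyses could be sketched as follows (hypothetical variable names; the indicator coding forces the treatment effect to be zero at baseline):

library(lme4)

# long: one row per participant and visit, with y, visit (baseline/fu1/fu2), trt (control/active), id
long$d1 <- as.numeric(long$visit == "fu1" & long$trt == "active")
long$d2 <- as.numeric(long$visit == "fu2" & long$trt == "active")

constrained   <- lmer(y ~ visit + d1 + d2 + (1 | id), data = long)   # no baseline treatment term
unconstrained <- lmer(y ~ visit * trt + (1 | id), data = long)       # baseline treatment term included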

The BEST [2] trial was an RCT assessing the impact of the New Orleans Intervention Model on children entering foster care in the UK, with measurement of outcomes at baseline and two follow-up visits. As a result of practical and legal considerations relating to the setting of the study, over the 10-year duration of the trial there were three separate schedules of recruitment: (1) consent first, followed by baseline measures and then randomisation; (2) randomisation, followed by consent and then baseline; (3) consent, followed by randomisation and then baseline. As not all participants were recruited with randomisation occurring after baseline, it could not be guaranteed prior to unblinding at the end of the study that the treatment groups were balanced at baseline for the primary outcome. The pre-defined statistical analysis plan for the study therefore took the approach of including a term for the treatment effect at baseline to account for any unexpected differences.

At the conclusion of the trial, there was some degree of difference at baseline between the unblinded treatment groups for the primary outcome, and as a result the choice to include a term for this in the primary analysis model appeared justified. Using the data from the trial in combination with simulations, we will show that there are scenarios where, due to study design or to account for high variability in outcome measures, including the baseline treatment effect may be worth considering, either as the primary model or as a sensitivity analysis to the constrained baseline analysis.

References:

[1] Coffman CJ, Edelman D, Woolson RF, To condition or not condition? Analysing ‘change’ in longitudinal randomised controlled trials. BMJ Open 2016;6:e013096. doi: 10.1136/bmjopen-2016-013096

[2] BEST [Accepted Nature medicine]



posters-tuesday-BioZ: 36

The Subtle Yet Impactful Procedural Choices in Conducting Matching-Adjusted Indirect Comparison - Insights from Simulation

Gregory Chen1, Michael Seo2, Isaac Gravestock2

1MSD, Switzerland; 2Roche, Switzerland

Population-adjusted indirect treatment comparisons (ITCs) play a crucial role in clinical biostatistics, particularly in the health technology assessment (HTA) space. Demonstrating the comparative effectiveness of an investigational treatment against standard-of-care comparators is essential for both clinical and economic decision-making in reimbursement submissions. However, head-to-head randomized trials for payer-interested comparators are often unavailable at the time of a HTA submission, necessitating the use of indirect comparison methods.

When only aggregate data (AgD) are available for a comparator, the Matching-Adjusted Indirect Comparison (MAIC) method, originally introduced by Signorovitch, has become the go-to approach. Over time, variations and refinements have been introduced in both research and practice. This study conducts a simulation-based evaluation of the bias and relative efficiency of different MAIC estimators for the average treatment effect among treated (ATT), along with an assessment of confidence interval (CI) coverage based on asymptotic derivations, robust variance estimators, and bootstrap methods.

The simulation utilizes the {maicplus} R package and is designed to generate insights for both binary and time-to-event endpoints. The primary focus is on unanchored ITCs, with a secondary analysis of anchored comparisons to assess the robustness of findings. The study examines performance across various scenarios, including different sample sizes, true event rates, and degrees of prognostic factor overlap. Additionally, we investigate the impact of including non-prognostic factors, omitting key confounders, and interactions between these factors. To further contextualize the MAIC findings, we incorporate inverse probability of treatment weighting (IPTW) estimators, quantifying the trade-offs in performance metrics when individual patient data (IPD) for the comparator arm are unavailable.
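For orientation, the core of the Signorovitch-type weight estimation that such packages implement can be written in a few lines of R; everything below (covariate names, the published aggregate values, the response variable) is purely illustrative:

estimate_maic_weights <- function(X_ipd, agd_means) {
  Xc  <- sweep(X_ipd, 2, agd_means)                    # centre IPD covariates at the AgD means
  obj <- function(a) sum(exp(Xc %*% a))
  grd <- function(a) colSums(Xc * drop(exp(Xc %*% a)))
  a_hat <- optim(rep(0, ncol(Xc)), obj, grd, method = "BFGS")$par
  drop(exp(Xc %*% a_hat))                              # weights: weighted IPD means now match the AgD
}

set.seed(7)
ipd <- data.frame(age = rnorm(80, 58, 9), ecog = rbinom(80, 1, 0.5),
                  response = rbinom(80, 1, 0.45))      # hypothetical IPD arm
w   <- estimate_maic_weights(as.matrix(ipd[, c("age", "ecog")]),
                             c(age = 62, ecog = 0.4))  # hypothetical published AgD means
sum(w)^2 / sum(w^2)                                    # effective sample size after weighting
weighted.mean(ipd$response, w)                         # weighted response rate for an unanchored ITC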

The findings from this study will provide critical insights into the feasibility, reliability, and trade-offs of population-adjusted ITCs, offering guidance on best practices and methodological considerations in comparative effectiveness research.



posters-tuesday-BioZ: 37

Utility-based design: an improved approach to jointly analyze efficacy and safety in randomized comparative trials

Patrick Djidel, Armand Chouzy, Pierre Colin

Bristol Myers Squibb, Switzerland

Introduction

In randomized clinical trials, multiple endpoints are evaluated to assess new treatments, focusing on both efficacy and safety. Traditional oncology study designs often rely on a single primary endpoint, which can overlook other important objectives. Various frameworks, such as those proposed by Murray, Kavelaars, and Park, incorporate multivariate outcomes to improve decision-making by considering the risk-benefit tradeoff. We propose a utility-based design tool, extending Murray’s approach, that accounts for the correlation between efficacy, safety and the cause of death (due to disease progression vs. fatal adverse event).

Methods

The proposed statistical framework is based on a joint probit model as follows: the clinical endpoints are considered categorical (e.g. toxicity grade and objective response rate) and a composite endpoint is derived based on combinations of both safety and efficacy categories and numerical utilities. The utility matrix is obtained via a consensus among clinical trial physicians. Then, to evaluate the treatment effect, we calculate the mean joint probabilities via a joint probit model and combine them with the utility matrix. To support decision-making, a formal test is derived to analyze the improvement of the utility score due to the treatment effect.


Results

We provide a statistical tool to efficiently compare treatment arms from randomized trials and evaluate the efficacy/safety trade-off. A statistical test and a target sample size calculation tool have been developed to properly compare treatment arms for decision making, while controlling Type I and Type II error rates. Some examples of treatment arm comparisons are available using data from oncology studies.

Conclusion

We propose a practical approach to consider the efficacy-safety tradeoff and efficiently compare treatments based on categorical outcomes. The joint probit model considers the correlation between efficacy and toxicity to support multivariate decision-making and efficiently determines whether a treatment is clinically superior to another, by reducing the multidimensional outcome to a single mean utility score. In addition, the benefit-risk ratio is often considered to compare multiple dose levels, looking for the optimal dose. The proposed utility score is useful in summarizing the benefit-risk ratio in early drug development. The statistical test we propose can also be used for dose optimization or seamless designs and combined with commonly used study designs, such as Group Sequential Design.



posters-tuesday-BioZ: 38

Hierarchical Composite Endpoints and win ratio methods in cardiovascular trials: a systematic review and consequent guidance

Ruth Owen1,2,3, John Gregson1, Dylan Taylor2,3, David Cohen4,5, Stuart Pocock1

1London School of Hygiene and Tropical Medicine, United Kingdom; 2Centro Nacional de Investigaciones Cardiovasculares, Spain; 3Oxon Epidemiology, Spain; 4Cardiovascular Research Foundation, NY USA; 5St. Francis Hospital, NY USA

Introduction

The value of hierarchical composite endpoints (and their analysis using the win ratio) is being increasingly recognised, especially in cardiology trials. Their reporting in journal publications has not been previously explored.

Methods

A search of 14 general medical and cardiology journals was done using 13 search terms including “hierarchical composite”, “win ratio”, and “Finkelstein Schoenfeld” during 01/Jan/2022 to 31/Jan/2024. We identified 61 articles (from 36 unique trials) that included analyses using the win ratio. For multiple such articles from the same trial, we selected the most major (or first) one. A standardized proforma was completed by two reviewers (DT+RO), with any inconsistencies resolved by consensus.

Results

Of the 36 trials identified, 10 were in NEJM, 20 were primary publications, and 10 had win ratio as the primary analysis. Most (N=26) were drug trials, but trials of device/surgery (N=7) and treatment strategies (N=3) also occurred. The most common conditions were heart failure (N=15) and ischemic heart disease (N=5).

The choice of hierarchical components varied: nearly all trials (N=32) had mortality as the first comparison, 30 of which had non-fatal events next. The number of non-fatal event components ranged from 0 (4 trials) to 6 (2 trials). In 27 trials, at least one component was a quantitative outcome, most commonly a quality-of-life score, of which 12 defined a minimal margin to claim a win/loss. Hierarchies ranged from 1 to 9 components, with 3 (N=11) and 4 (N=6) components being most common.

Trials usually reported the unmatched win ratio, its 95% CI and Finkelstein-Schoenfeld p-value, with results commonly presented using flowcharts (N=10) or bar charts (N=12). Win odds (4 trials) and win difference (3 trials) were occasionally reported. Stratified (9 trials) and covariate-adjusted analyses (1 trial) were not common. Of the 28 trials that reported the percentage of tied comparisons, 8 had <10% ties whilst 5 had >70% ties.

Specific examples will be presented to illustrate the diversity of good (and sometimes bad) practice in the use and reporting of the win ratio. We conclude with a set of recommendations for future use.

Discussion

This systematic review is the first to document the diversity of uses of hierarchical composite endpoints and win ratio analyses in journal publications. This portfolio of mostly appropriate applications in cardiovascular trials suggests that hierarchical composite outcomes could be relevant in other diseases where treatment response cannot be captured by a single endpoint.



posters-tuesday-BioZ: 39

Power calculation using the win-ratio for composite outcomes in randomized trials

David Kronthaler1, Felix Beuschlein2, Sven Gruber2, Matthias Schwenkglenks3, Ulrike Held1

1Epidemiology, Biostatistics and Prevention Institute, Department of Biostatistics, University of Zurich, Switzerland; 2Department of Endocrinology, Diabetology and Clinical Nutrition, University Hospital Zurich, University of Zurich, Switzerland; 3Health Economics Facility, Department of Public Health, University of Basel, Switzerland

Background: The use of composite outcomes is common in clinical research. These can include, for example, death from any cause and any untoward hospitalization, and corresponding effect measures would be the risk ratio or the hazard ratio typically addressing the time to first occurrence of any of the two events. In these situations, the hierarchy of the outcomes is ignored, and the combination of different outcome distributions is difficult.

Methods: We used the win-ratio approach (Pocock et al. 2024) for the design and sample size calculation of a randomized controlled trial in patients suspected for primary aldosteronism. The win-ratio assumes N_T and N_C patients in treatment and control group, resulting in N_T × N_C pairwise comparisons of patients in treatment and control group. The win-ratio is then calculated as R_W = N_W / N_L, with N_W and N_L being the counts of wins and losses of patients in the treatment group.

The trial has a composite outcome with the following hierarchy:

I Elevated blood pressure (binary, according to WHO definition) and

II Defined daily dose (DDD) of blood pressure medication.

To assess for each comparison whether the patient in the treatment group is the winner or the loser, the hierarchy I outcomes are compared first and, upon a tie, the hierarchy II outcomes are compared. As a reference, the power of the trial was compared to a standard sample size calculation for a binary and a continuous outcome with the same specifications.
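A compact R sketch of the pairwise win-ratio computation for one simulated trial (the outcome distributions below are arbitrary placeholders, not the values assumed in the actual sample size calculation); the power is then the proportion of simulation runs in which the test on log R_W rejects:

set.seed(42)
n_t <- n_c <- 300
elev_t <- rbinom(n_t, 1, 0.35); elev_c <- rbinom(n_c, 1, 0.45)  # hierarchy I: elevated BP (1 = yes)
ddd_t  <- rgamma(n_t, 2, 1.2);  ddd_c  <- rgamma(n_c, 2, 1.0)   # hierarchy II: DDD of BP medication

win1  <- outer(elev_t, elev_c, function(t, c) t == 0 & c == 1)  # treatment patient wins on hierarchy I
loss1 <- outer(elev_t, elev_c, function(t, c) t == 1 & c == 0)
tie1  <- !(win1 | loss1)

win2  <- tie1 & outer(ddd_t, ddd_c, "<")                        # ties resolved by lower DDD
loss2 <- tie1 & outer(ddd_t, ddd_c, ">")

rw <- (sum(win1) + sum(win2)) / (sum(loss1) + sum(loss2))       # R_W = N_W / N_L
rw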

Results: The power of the trial was assessed with 1000 simulation runs, with N_T = N_C = 300 patients and assuming 15% drop-out. Our simulation showed that the resulting power of the trial was 85% and the estimated win-ratio R_W was 1.3. Under identical assumptions, a standard power calculation would have resulted in 30% power for the hierarchy I outcome and 73% power for the hierarchy II outcome.

Conclusion: While the win-ratio has been employed in secondary analyses of randomized trials, it has rarely been used at study design level. Sample size calculation using the win-ratio as effect measure is efficient from a methodological perspective, and it captures well the complexities of using potentially censored composite outcomes with a hierarchy in clinical research.

References

Pocock, Stuart J, John Gregson, Timothy J Collier, Joao Pedro Ferreira, and Gregg W Stone. 2024. “The Win Ratio in Cardiology Trials: Lessons Learnt, New Developments, and Wise Future Use.” European Heart Journal 45 (44): 4684–99. https://doi.org/10.1093/eurheartj/ehae647.



posters-tuesday-BioZ: 40

Feasibility of propensity score weighted analysis in rare disease trials: a simulation study

Alexander Przybylski1, Francesco Ambrosetti2, Lisa Hampson2, Nicolas Ballarini2

1Novartis, UK; 2Novartis, Switzerland

Introduction

Clinical trials in rare diseases often face challenges due to small sample sizes and single-arm non-randomized designs, which increase the risk of confounding bias. Propensity scoring (PS) methods are commonly applied to mitigate such biases. However, in small samples, the ability to fit adequate PS models that reduce covariate imbalance has not been widely studied. In the context of an anticipated large treatment effect where the response probability on control is very low, the statistical challenges of using PS weighting for treatment effect estimation are further complicated. Our aim was to evaluate the feasibility and performance of PS methods under these specific conditions.

Methods

A simulation study was conducted to assess the impact of covariate imbalance, sample size, and treatment effect size on the feasibility and performance of several estimators and intervals for the average treatment effect in the treated (ATT; expressed as a difference in marginal risks). The focus was on two key baseline covariates; large treatment effects informed by prior knowledge; and a small sample size of 15 subjects per arm. Weighted and unweighted ratio estimators, a hybrid approach incorporating PS model convergence and covariate imbalance criteria, and standardization-based estimators were evaluated according to estimator convergence rate, the probability of proceeding with an indirect comparison based on measures of imbalance (standardized mean difference; SMD), and bias. Coverage probabilities of intervals were also calculated.
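A stripped-down R sketch of the weighting, balance check and hybrid decision rule (hypothetical variable names; with only 15 subjects per arm the PS model may fail to converge or separate completely, which is part of what the simulation assesses):

# d: trial (1 = single-arm trial, 0 = external control), response (0/1), covariates x1, x2
ps  <- glm(trial ~ x1 + x2, family = binomial, data = d)$fitted.values
d$w <- ifelse(d$trial == 1, 1, ps / (1 - ps))              # ATT weights: re-weight externals to the trial

smd <- sapply(c("x1", "x2"), function(v) {                 # simple weighted SMD per covariate
  m1 <- weighted.mean(d[[v]][d$trial == 1], d$w[d$trial == 1])
  m0 <- weighted.mean(d[[v]][d$trial == 0], d$w[d$trial == 0])
  abs(m1 - m0) / sqrt((var(d[[v]][d$trial == 1]) + var(d[[v]][d$trial == 0])) / 2)
})

if (max(smd) <= 0.25) {                                    # hybrid rule: proceed only if balance is adequate
  p1 <- weighted.mean(d$response[d$trial == 1], d$w[d$trial == 1])
  p0 <- weighted.mean(d$response[d$trial == 0], d$w[d$trial == 0])
  p1 - p0                                                  # weighted difference in marginal risks (ATT)
}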

Results

Standardization-based estimators were unreliable due to low sample size and complete separation issues. Propensity score models could be estimated and were able to reduce imbalance even with small sample sizes and high imbalance. Setting a 0.1 SMD threshold for adequate covariate balance, 25% of simulation runs met the criteria for performing the indirect comparison analysis. The use of a less conservative 0.25 threshold for SMD increased this probability to 50% while maintaining acceptable bias and coverage probability. Conditional on observing at least one response in the control arm, average conditional bias was marginally improved via propensity score weighting.

Conclusions

Propensity score weighting methods can address confounding biases in non-randomized studies, even with small sample sizes and large treatment effects. However, in our setting, the most suitable approach involved using a hybrid method that combines pre-specified criteria for performing the indirect comparison.



posters-tuesday-BioZ: 41

A basket trial for rare diseases, with a crossover design for its substudies: a simulation study

Elena G Lara, Steven Teerenstra, Kit C.B. Roes, Joanna IntHout

Radboud University Medical Center, The Netherlands

Background. Recent advancements in precision medicine generate therapy options for rare diseases. Assessing a new treatment targeted at a rare disease subgroup can make recruiting the required sample size even more challenging. Current work recommends grouping rare diseases in a basket trial, where one drug is evaluated in multiple diseases that share an etiology (e.g. a gene mutation). This allows more patients to be included and information to be borrowed between substudies. A further recommendation to improve efficiency for trials involving chronic and stable conditions is the use of crossover designs. Our research focuses on basket trials with a crossover design for the substudies. These may increase the precision of the estimated treatment effect, through both borrowing of information across substudies and an efficient substudy design.

Methods. We evaluated the operating characteristics of basket trials in which each substudy follows a crossover design via Monte Carlo simulation. We generated realistic scenarios related to the SIMPATHIC project, under parallel and crossover designs and with different numbers of substudies (from 2 to 9). We applied estimation methods including random-effects meta-analysis, Bayesian hierarchical modelling (BHM), EXNEX, adaptive lasso, stratified analysis and naïve pooling, and studied the bias, precision, power and false positive rate of the substudy estimates as well as of the overall trial estimates.
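
As a rough illustration of one of the estimation methods listed above, the sketch below pools per-substudy treatment effect estimates with a DerSimonian-Laird random-effects meta-analysis; the effect estimates, standard errors and function name are hypothetical and not taken from the SIMPATHIC simulations.

```python
import numpy as np

def random_effects_meta(theta, se):
    """DerSimonian-Laird random-effects pooling of substudy effect estimates.

    theta : per-substudy treatment effect estimates
    se    : their standard errors
    Returns the overall estimate, its standard error, and the heterogeneity variance tau^2.
    """
    theta, se = np.asarray(theta, float), np.asarray(se, float)
    w = 1.0 / se**2                              # fixed-effect (inverse-variance) weights
    theta_fe = np.sum(w * theta) / np.sum(w)
    q = np.sum(w * (theta - theta_fe) ** 2)      # Cochran's Q statistic
    df = len(theta) - 1
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c)                # DL estimate of between-substudy variance
    w_re = 1.0 / (se**2 + tau2)                  # random-effects weights
    est = np.sum(w_re * theta) / np.sum(w_re)
    return est, np.sqrt(1.0 / np.sum(w_re)), tau2

# e.g. three substudies with (hypothetical) crossover-design treatment effect estimates
print(random_effects_meta([0.8, 0.4, 0.1], [0.30, 0.35, 0.40]))
```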

Results. The efficiency gains of crossover designs in conventional trials carry over to basket trials. Methods that borrow information improve the estimation of substudy treatment effects through increased precision. This increase in precision is smaller in substudies with a crossover design than in parallel-group substudies with the same number of patients; borrowing in this setting also results in less shrinkage. Among the borrowing methods evaluated, EXNEX appears best able to discriminate between substudies with a true effect and those with a small or null effect. Meta-analysis, BHM and naïve pooling achieve the highest power for the overall estimate, although this power is low when the treatment has a true effect in fewer than half of the substudies.

Conclusion. Incorporating crossover designs into basket trial substudies, when the necessary assumptions are met, results in a more efficient design and practicable sample sizes compared with parallel-group substudies. In addition, including randomization and a control arm per substudy provides more valid inference than a single-arm design. Altogether, this design can facilitate drug development for rare diseases.

Project funded by Horizon Europe (Grant no. 101080249).



posters-tuesday-BioZ: 42

Comparing randomized trial designs in rare diseases with longitudinal models: a simulation study showcased by Autosomal Recessive Cerebellar Ataxias

Niels Hendrickx1, France Mentré1, Alzahra Hamdan2, Mats Karlsson2, Andrew Hooker2, Andreas Traschütz3,4, Cynthia Gagnon5, Rebecca Schüle6, ARCA Study group7, EVIDENCE-RND Consortium7, Matthis Synofzik3,4, Emmanuelle Comets1,8

1Université Paris Cité, IAME, Inserm, F-75018, Paris, France; 2Pharmacometrics Research Group, Department of Pharmacy, Uppsala University, Uppsala, Sweden; 3Division Translational Genomics of Neurodegenerative Diseases, Hertie Institute for Clinical Brain Research (HIH), University of Tübingen, Tübingen, Germany; 4German Center for Neurodegenerative Diseases (DZNE), Tübingen, Germany.; 5Centre de Recherche du CHUS Et du Centre de Santé Et Des Services Sociaux du Saguenay-Lac-St-Jean, Faculté de Médecine, Université de Sherbrooke, Québec, Canada.; 6Hertie-Center for Neurology, University of Tübingen, Tübingen, Germany; 7Group author; 8Univ Rennes, Inserm, EHESP, Irset - UMR_S 1085, 35000, Rennes, France.

Background:

Parallel designs with an end-of-treatment analysis are commonly used for randomised trials (1), but they remain challenging to conduct in rare diseases because of small sample sizes and heterogeneity. A more powerful alternative could be model-based approaches (2,3). We used simulations to investigate the performance of longitudinal modelling for evaluating disease-modifying treatments in rare diseases. Our setting was based on a model describing the progression of the SARA (Scale for the Assessment and Rating of Ataxia) score, a standard clinician-reported outcome, in patients with ARCA (Autosomal Recessive Cerebellar Ataxia), a group of ultra-rare, genetically defined, neurodegenerative diseases (4).

Methods:

We performed a simulation study to evaluate the influence of trial settings on the ability to detect a treatment effect slowing disease progression, using a previously published non-linear mixed-effect logistic model (5). We compared the type 1 error and corrected power of parallel, crossover and delayed-start designs (6,7) across several trial settings: trial duration (2 or 5 years), disease progression rate (slower or faster), magnitude of residual error (σ = 2 or σ = 0.5), number of patients (100 or 40), and method of statistical analysis (longitudinal analysis with non-linear or linear models, or standard statistical analysis).
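
To make the simulation workflow concrete, the sketch below uses a deliberately simplified linear progression model (not the published non-linear mixed-effect logistic SARA model) to contrast a standard end-of-treatment analysis with a longitudinal slope analysis of a parallel design; all parameter values, sample sizes and function names are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulate_trial(n_per_arm=50, years=2.0, n_visits=5, slope=1.0,
                   effect=0.5, sd_slope=0.8, sd_res=2.0):
    """Simulate a parallel trial: the score progresses linearly; treatment slows the slope."""
    t = np.linspace(0.0, years, n_visits)
    data, arms = [], []
    for arm in (0, 1):  # 0 = control, 1 = treated
        ind_slope = slope * (1 - effect * arm) + rng.normal(0, sd_slope, n_per_arm)
        y = ind_slope[:, None] * t[None, :] + rng.normal(0, sd_res, (n_per_arm, n_visits))
        data.append(y); arms.append(np.full(n_per_arm, arm))
    return t, np.vstack(data), np.concatenate(arms)

def analyse_end_of_treatment(t, y, arm):
    """Standard analysis: two-sample t-test on the last visit only."""
    return stats.ttest_ind(y[arm == 1, -1], y[arm == 0, -1]).pvalue

def analyse_longitudinal(t, y, arm):
    """Longitudinal analysis: per-patient OLS slope, then t-test on the slopes."""
    slopes = np.polyfit(t, y.T, 1)[0]
    return stats.ttest_ind(slopes[arm == 1], slopes[arm == 0]).pvalue

def rejection_rate(analysis, effect, n_sim=2000, alpha=0.05):
    """Monte Carlo type 1 error (effect=0) or power (effect>0)."""
    rejections = 0
    for _ in range(n_sim):
        t, y, arm = simulate_trial(effect=effect)
        rejections += analysis(t, y, arm) < alpha
    return rejections / n_sim

print("type 1 error, longitudinal:", rejection_rate(analyse_longitudinal, effect=0.0))
print("power, end-of-treatment:   ", rejection_rate(analyse_end_of_treatment, effect=0.5))
print("power, longitudinal:       ", rejection_rate(analyse_longitudinal, effect=0.5))
```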

Results:

In all settings, non-linear mixed-effect models gave controlled type 1 error and higher power (88% for a parallel design) than a rich (75%) or sparse (49%) linear mixed-effect model or a standard statistical analysis (36%). Parallel and delayed-start designs performed better than crossover designs. With slow disease progression and high residual error, longer durations were needed to reach power above 80%: 5 years for slower-progressing ataxias, compared with 2 years for faster-progressing ataxias.

Conclusion:

In our settings, non-linear mixed-effect modelling allowed all three designs to achieve higher power than a standard end-of-treatment analysis. Our analysis also showed that delayed-start designs are promising: in this context they are as powerful as parallel designs, with the added advantage that all patients receive the treatment during the trial.

References:

(1) E9 Statistical Principles for Clinical Trials, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/e9-statistical-principles-clinical-trials, 2020

(2) Synofzik et al. Neuron 2019

(3) Buatois et al. Statistics in Medicine 2021

(4) Karlsson et al. CPT Pharmacometrics Syst Pharmacol 2013

(5) Hamdan et al. CPT 2024

(6) Liu-Seifert et al. PLoS ONE 2015

(7) Wang et al. Pharmaceutical Statistics 2019



posters-tuesday-BioZ: 43

Sequential decision making in basket trials leveraging external-trial data: with applications to rare-disease trials

Giulia Risca1, Stefania Galimberti1, Maria Grazia Valsecchi1, Haiyan Zheng2

1Bicocca Bioinformatics Biostatistics and Bioimaging B4 Center, Department of Medicine and Surgery, University of Milan-Bicocca, Monza, Italy; 2Department of Mathematical Sciences, University of Bath, Bath, UK

Introduction: Rare diseases present unique challenges for clinical trial design because of the small pool of eligible patients. Planning rare-disease studies within a basket trial, which can simultaneously evaluate a new treatment in patients sharing a disease trait, is attractive because strength can be borrowed across relevant patient subgroups. Motivated by a real rare-disease trial under planning, we develop a Bayesian sequential design that allows incorporation of both external-trial and within-trial data for basket trials involving rare diseases.

Methods: We consider two patient subgroups that receive the same treatment before deciding whether a third subgroup should be treated in the basket trial. The EXNEX method [1] is extended to include a prior mixture component formed using external-trial data; that is, the treatment effects in the three subgroups are assumed to be exchangeable, or non-exchangeable but consistent with the external-trial data, or completely distinct. On completion of the first two subgroups, our Bayesian meta-analytic-predictive model is used to obtain the predictive probability (PP) of an efficacious treatment in the third subgroup. Interim futility assessment is guided by a power spending function.
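
A minimal sketch of the general idea, not the authors' meta-analytic-predictive model: the response rate in the third subgroup is represented by a mixture of Beta components corresponding to the exchangeable, external-data-consistent and subgroup-specific assumptions, and the probability of an efficacious treatment is evaluated under that mixture. All counts, mixture weights and the efficacy target below are hypothetical.

```python
from scipy import stats

# Beta components for the third subgroup's response rate (Beta-Binomial conjugacy):
# "EX"  : exchangeable with subgroups 1-2 (their pooled data as prior information)
# "EXT" : consistent with the external-trial data
# "NEX" : vague, subgroup-specific component
components = {
    "EX":  stats.beta(1 + 9, 1 + 11),   # 9/20 responders pooled over subgroups 1-2 (hypothetical)
    "EXT": stats.beta(1 + 6, 1 + 14),   # 6/20 responders in the external trial (hypothetical)
    "NEX": stats.beta(1, 1),            # vague component
}
weights = {"EX": 0.5, "EXT": 0.3, "NEX": 0.2}   # prior mixture weights (hypothetical)

def prob_efficacy(target=0.3):
    """P(response rate in the third subgroup exceeds a clinically relevant target)."""
    return sum(w * (1 - components[k].cdf(target)) for k, w in weights.items())

print(f"Probability of an efficacious treatment in subgroup 3: {prob_efficacy():.2f}")
```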

Results: We assess the performance of this design through simulations, which show that its performance is sensitive to the choice of certain parameters (e.g., prior mixture weights, cut-offs for the interim and final analyses). Specifically, the PPs at the first interim depend strongly on the allocation of the mixture weights. Pessimistic scenarios show large variability in PPs depending on whether the exchangeability or the prior-data consistency assumption is violated. The design is, however, generally robust when there is strong belief in a highly effective treatment, and all models estimate the true treatment effect in each subgroup accurately in terms of bias and mean squared error. Finally, the marginal type I error is always well controlled.

Conclusions: Our method allows mid-course adaptation and ethical decision-making. It is novel and addresses critical gaps in rare-disease trials. The principles are generalizable to other contexts.

References:

1. Neuenschwander, B., Wandel, S., Roychoudhury, S. & Bailey, S. (2016) Robust exchangeability designs for early phase clinical trials with multiple strata. Pharmaceutical Statistics, 15, 123–134. Available from: https://doi.org/10.1002/pst.1730



posters-tuesday-BioZ: 44

Adaptive Designs and Bayesian Approaches: The Future of Clinical Trials

Anjali Yadav

JSS Medical Research Asia Pacific Pvt. Ltd., India

Background / Introduction

Traditional clinical trial designs rely on fixed protocols that do not allow for modifications once the study is initiated. This rigidity can lead to inefficiencies, ethical concerns, and prolonged development timelines. Adaptive designs provide a flexible framework that permits pre-specified modifications based on interim analyses, improving resource allocation and patient outcomes. Meanwhile, Bayesian approaches leverage prior knowledge and continuously update probabilities, offering a more dynamic and intuitive method for decision-making. The integration of these methodologies has the potential to revolutionize clinical trial efficiency, particularly in the era of precision medicine and rare disease research.

Methods

This study reviews key adaptive design strategies, including group sequential, response-adaptive, and platform trials, highlighting their statistical foundations and regulatory considerations. Bayesian methodologies, such as Bayesian hierarchical modeling and predictive probability monitoring, are explored in the context of trial adaptation and decision-making. Case studies from oncology, vaccine development, and rare disease trials are examined to illustrate the real-world application and advantages of these approaches.
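
To make the idea of predictive probability monitoring concrete, here is a minimal single-arm Beta-Binomial sketch: at an interim analysis it computes the probability that the completed trial will meet a posterior efficacy criterion, which could then support an early-stopping decision. The prior, thresholds and interim data are hypothetical and not tied to any of the case studies reviewed here.

```python
from scipy import stats

def predictive_prob_success(responses, n_interim, n_max,
                            a=1.0, b=1.0, p0=0.2, final_threshold=0.95):
    """Bayesian predictive probability monitoring for a single-arm binary-endpoint trial.

    At an interim with `responses` out of `n_interim` patients, compute the probability
    that the completed trial (n_max patients) will declare success, i.e. that the final
    posterior probability P(p > p0) exceeds `final_threshold`.
    """
    a_post, b_post = a + responses, b + n_interim - responses
    n_rem = n_max - n_interim
    pp = 0.0
    for y in range(n_rem + 1):  # possible numbers of future responders
        # posterior predictive probability of y future responses (Beta-Binomial)
        prob_y = stats.betabinom(n_rem, a_post, b_post).pmf(y)
        # would the completed trial be declared a success with these data?
        a_fin, b_fin = a_post + y, b_post + n_rem - y
        success = (1 - stats.beta(a_fin, b_fin).cdf(p0)) > final_threshold
        pp += prob_y * success
    return pp

# e.g. 4 responders among the first 15 of 40 planned patients (hypothetical)
print(predictive_prob_success(responses=4, n_interim=15, n_max=40))
```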

Results

Adaptive designs have demonstrated significant reductions in trial duration and costs while maintaining scientific integrity. Bayesian methods have enhanced decision-making by incorporating historical data and real-time learning, leading to more efficient dose-finding, early stopping for efficacy or futility, and improved patient allocation. Regulatory agencies, including the FDA and EMA, have increasingly supported these innovative methodologies, providing frameworks for their implementation. Case studies highlight improved success rates, patient safety, and ethical advantages compared to traditional approaches.

Conclusion

The adoption of adaptive designs and Bayesian approaches is transforming clinical research by making trials more efficient, ethical, and informative. While challenges remain, including regulatory acceptance, operational complexity, and computational demands, ongoing advancements in statistical methods and trial simulations continue to enhance their feasibility. The future of clinical trials lies in the strategic integration of these methodologies, fostering a more flexible and patient-centric approach to drug development.