Conference Agenda

Session: Prediction / prognostic modelling 2
Time: Monday, 25/Aug/2025, 2:00pm - 3:30pm

Location: ETH E23

D-BSSE, ETH, 84 seats

Presentations
11-prediction-prognostic-2: 1

Assessing time-dependent discrimination of prognostic model with intercurrent treatment: use of multi-state modelling for inverse probability censoring weighting

Loïc Vasseur1,2, Derek Hazard3, Nicolas Boissel2, Martin Wolkewitz3, Jérôme Lambert1,4

1Epidemiology and Clinical Statistics for Trials & Real-world evidence Research (ECSTRRA), Saint Louis Research Institute, UMR1342, INSERM, Université de Paris, France; 2Adolescent and Young Adult Hematology Unit, Saint Louis University Hospital, Assistance Publique-Hôpitaux de Paris (AP-HP), Paris, France; 3Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany; 4Biostatistics and Medical Information Department, Saint Louis University Hospital, AP-HP, Paris, France

Introduction:

Prognostic markers may be used to inform clinical decisions about treatment. As a consequence, it is difficult to disentangle the prognostic value of the marker from the treatment effect, especially when treatment is started after baseline. Following the estimand framework, one solution is to evaluate discrimination using a “hypothetical” strategy in which treatment would never have been started, by censoring follow-up at the time of intercurrent treatment.

When assessing discrimination performance for a time-to-event outcome at a given time t, the time-dependent area under the receiver operating characteristic curve (AUC) is frequently used, with inverse probability of censoring weighting (IPCW) to account for censoring before t. Existing methods for IPCW can be marker-independent or marker-dependent but always assume a single censoring mechanism.

Our aim was to account for the differential impact of the marker on censoring due to loss to follow-up and censoring due to intercurrent treatment when evaluating the discrimination performance of a baseline prognostic marker that could guide treatment decisions.

Methods:

We used the Inverse Probability of Censoring Weighting (IPCW) estimation of Cumulative/Dynamic time-dependent AUC to assess discrimination.
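As a rough illustration of the marker-independent baseline (the Kaplan-Meier variant of IPCW), the following minimal sketch computes a cumulative/dynamic AUC at a single time t. The function names `censoring_survival` and `ipcw_auc` are hypothetical, the toy data are invented, and the tie convention (events precede censorings at the same time) is an assumption; this is not the authors' implementation.

```python
import numpy as np

def censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring survival function G(u) = P(C > u).

    Assumption: at tied times, events are taken to precede censorings.
    Returns a function G(u, left_limit=False) that evaluates G(u) or G(u-).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    cens_times = np.unique(time[event == 0])
    # Subjects at risk of being censored at u: still under observation after u,
    # plus those censored exactly at u.
    at_risk = np.array([np.sum(time > u) + np.sum((time == u) & (event == 0))
                        for u in cens_times])
    n_cens = np.array([np.sum((time == u) & (event == 0)) for u in cens_times])
    surv = np.concatenate(([1.0], np.cumprod(1.0 - n_cens / at_risk)))

    def G(u, left_limit=False):
        side = "left" if left_limit else "right"
        # Number of censoring times < u (left limit) or <= u.
        return surv[np.searchsorted(cens_times, u, side=side)]

    return G

def ipcw_auc(time, event, marker, t):
    """Cumulative/dynamic AUC at time t with Kaplan-Meier IPCW weights.

    Cases: event observed by t, weighted by 1 / G(T_i-).
    Controls: still event-free and under observation after t, weighted by 1 / G(t).
    Higher marker values are assumed to indicate higher risk.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    marker = np.asarray(marker, dtype=float)
    G = censoring_survival(time, event)
    case = (time <= t) & (event == 1)
    ctrl = time > t
    if not case.any() or not ctrl.any():
        raise ValueError("need at least one case and one control at time t")
    w_case = 1.0 / G(time[case], left_limit=True)
    w_ctrl = np.full(ctrl.sum(), 1.0 / G(t))
    diff = marker[case][:, None] - marker[ctrl][None, :]
    concordant = (diff > 0) + 0.5 * (diff == 0)  # ties in the marker count one half
    numerator = np.sum(w_case[:, None] * w_ctrl[None, :] * concordant)
    return numerator / (w_case.sum() * w_ctrl.sum())

# Toy data (invented): event = 1 for the outcome, 0 for any censoring.
auc = ipcw_auc(time=[1, 2, 3, 4, 5], event=[1, 0, 1, 1, 0],
               marker=[5, 1, 2, 4, 3], t=3.5)
```

With a marker-independent Kaplan-Meier weight, the control weight 1 / G(t) is constant and cancels between numerator and denominator; it only matters once the censoring model depends on the marker.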

Through a time-inhomogeneous multi-state model in which the two censoring processes (due to loss to follow-up or intercurrent treatment) correspond to two distinct states, censoring is allowed to depend differentially on the marker (IPCWMSM). This method is compared to IPCW calculations in which censoring due to intercurrent treatment and loss to follow-up are combined and considered either marker-independent, using a Kaplan-Meier estimate of the censoring distribution (IPCWKM), or marker-dependent, using a Cox proportional hazards model (IPCWCox). Through simulations, we compared the bias, coverage probability, and root mean square deviation (RMSD) of the three methods.
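One way to write down the idea of two marker-dependent censoring mechanisms is sketched below. The notation is assumed for illustration and does not come from the abstract, and the exponential form presumes continuous censoring distributions; the authors' multi-state estimator may differ in its details.

```latex
% Assumed notation (not from the abstract):
%   C_L, C_T           : times to censoring by loss to follow-up and by intercurrent treatment
%   \alpha_L, \alpha_T : the corresponding cause-specific censoring hazards given the marker X
%   \tilde{T}_i, \delta_i : observed time and event indicator of subject i
G(t \mid X) = P\bigl(\min(C_L, C_T) > t \mid X\bigr)
            = \exp\!\Bigl(-\int_0^t \bigl[\alpha_L(u \mid X) + \alpha_T(u \mid X)\bigr]\,du\Bigr)

% IPCW weight of subject i when estimating the AUC at time t:
w_i(t) = \frac{\mathbf{1}\{\tilde{T}_i \le t,\ \delta_i = 1\}}{G(\tilde{T}_i^{-} \mid X_i)}
       + \frac{\mathbf{1}\{\tilde{T}_i > t\}}{G(t \mid X_i)}
```

Because the two hazards enter separately, the marker is free to raise one (for example, high-risk patients being transplanted sooner) while leaving the other unchanged, which a single combined censoring model cannot represent as directly.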

Finally, the methods were illustrated in a cohort of patients treated for acute leukemia to assess the prognostic value of the European LeukemiaNet 2022 classification, a baseline risk stratification marker, with allogeneic hematopoietic cell transplantation in first complete remission as the intercurrent treatment.

Results:

In the simulation study, time-dependent AUC estimated with IPCWMSM showed reduced bias, better coverage, and lower RMSD, compared to IPCWKM and IPCWCox.

In the motivating example, the methods provided differing results, which could lead to inconsistent conclusions.

Conclusion:

When a prognostic marker is used to guide treatment and the aim is an unbiased estimate of discrimination performance in the absence of treatment, we recommend separating the censoring mechanisms due to loss to follow-up and intercurrent treatment, using multi-state modelling to estimate the IPCW weights in the calculation of time-dependent AUC.



11-prediction-prognostic-2: 2

Evaluating the discrimination of prediction models for recurrent medical events

Thomas Joel Spain1, Alexandra Hunt1, Hein Putter2, Victoria Watson3, Laura Bonnett1

1University of Liverpool, United Kingdom; 2University of Leiden, Netherlands; 3Phastar, United Kingdom

Background
Clinical prediction models combine multiple pieces of patient information to predict a clinical outcome for individuals with underlying health conditions. Evaluating a model's ability to distinguish between those who experience the outcome and those who do not, referred to as discrimination, is a key step in model development.
Prediction models are often developed using logistic regression, time-to-event methods, or increasingly, machine learning. Tools and methods are available to assess discrimination and calibration in these models. However, many medical conditions, such as recurrent seizures in epilepsy or repeated asthma exacerbations, are characterized by repeated episodes of the same type. While methodology and tools exist to evaluate the fit and calibration of recurrent event prediction models, there are currently no approaches to evaluate discrimination for these models.
Methods
We propose an alternative concordance statistic, which evaluates predicted and observed event counts, and demonstrate it using simulated data that reflects annual, monthly, and weekly repeated medical episodes. We present R code embedding C++ to minimize computation time, which includes methodology to calculate confidence intervals for the concordance statistic using the jackknife resampling method. This flexibility allows users to tailor the analysis to their specific modelling needs. The code will ultimately be incorporated into an R package to enhance accessibility for researchers.
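To make the idea of a count-based concordance statistic with jackknife confidence intervals concrete, here is a minimal Python sketch. It uses a generic pairwise definition (among pairs with different observed counts, the share whose predicted counts are ordered the same way, with tied predictions counting one half), which is an assumption and not necessarily the statistic proposed by the authors. The function names and the toy data are hypothetical.

```python
import numpy as np

def count_concordance(observed, predicted):
    """Share of usable pairs whose predicted counts are ordered like the observed counts.

    A pair is usable when the two observed counts differ; tied predictions score 0.5.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    i, j = np.triu_indices(len(observed), k=1)  # every unordered pair once
    d_obs = observed[i] - observed[j]
    d_pred = predicted[i] - predicted[j]
    usable = d_obs != 0
    if not usable.any():
        raise ValueError("all observed counts are equal; concordance is undefined")
    same_order = (np.sign(d_obs[usable]) == np.sign(d_pred[usable])).astype(float)
    score = np.where(d_pred[usable] == 0, 0.5, same_order)
    return float(score.mean())

def jackknife_ci(observed, predicted, z=1.96):
    """Leave-one-out jackknife standard error and a normal-approximation interval."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    n = len(observed)
    theta = count_concordance(observed, predicted)
    keep = ~np.eye(n, dtype=bool)  # row k leaves out subject k
    leave_one_out = np.array([count_concordance(observed[k], predicted[k]) for k in keep])
    se = np.sqrt((n - 1) / n * np.sum((leave_one_out - leave_one_out.mean()) ** 2))
    return theta, theta - z * se, theta + z * se

# Toy data (invented): observed yearly event counts and model-predicted counts.
obs = [0, 1, 2, 3, 1, 4]
pred = [0.5, 0.4, 2.0, 3.0, 1.2, 2.5]
c_stat, lower, upper = jackknife_ci(obs, pred)
```

The pairwise loop is quadratic in the number of patients and the jackknife repeats it once per patient, which gives a sense of why a compiled implementation can matter for large cohorts.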
Results
Embedding C++ within the R code significantly improved computational efficiency. The original R code, when evaluating discrimination for a prediction model developed as part of the PRISE study, required over 24 hours to run. In contrast, the optimized R code embedding C++ completed the same evaluation in under a second. Detailed results on the simulated data, including comparisons across annual, monthly, and weekly event frequencies, will be presented.
Conclusions
This work addresses a critical gap in evaluating discrimination for recurrent event prediction models. The proposed methodology and C++-embedded code provide researchers with a practical and efficient tool. This should ensure that prediction models for recurrent medical episodes can be developed and validated to the standards required by the TRIPOD reporting guidelines, and thus meet best statistical practice.



11-prediction-prognostic-2: 3

A Novel Dynamic Prediction Model Based on Interpretable Deep Learning Using Restricted Mean Survival Time

Pansheng Xue, Zheng Chen

Southern Medical University, People's Republic of China

I. Introduction

In healthcare, survival prediction models have shown great value in clinical practice. Most existing models use survival probability as the prediction outcome, which makes it difficult to answer an important question: “How much longer will the patient live, or live without experiencing disease progression?” A promising time-to-event measure for this purpose is the restricted mean survival time (RMST). In dynamic prediction, RMST can be extended to conditional RMST (cRMST), which adjusts life expectancy on the basis of the time a patient has already survived. Especially in chronic progressive diseases such as Alzheimer's disease, the patient's life expectancy can then be reflected and updated in a way that is more intuitive to interpret.
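For readers unfamiliar with the two measures, the sketch below computes RMST up to a horizon tau and a conditional RMST for a patient who has already survived to time s, both from a step-shaped survival curve. It uses one common definition, cRMST(s, tau) = (integral of S(u) from s to tau) / S(s); the abstract does not state which definition the authors use, and the function names and toy survival curve are hypothetical.

```python
import numpy as np

def _survival_at(times, surv, u):
    """Value of a right-continuous step survival curve at u (S = 1 before the first jump)."""
    values = np.concatenate(([1.0], np.asarray(surv, dtype=float)))
    return values[np.searchsorted(np.asarray(times, dtype=float), u, side="right")]

def _step_integral(times, surv, a, b):
    """Integral of the step survival curve over the interval [a, b]."""
    times = np.asarray(times, dtype=float)
    inner = times[(times > a) & (times < b)]
    grid = np.concatenate(([a], inner, [b]))
    # The curve is constant on each grid interval, equal to its value at the left end.
    heights = _survival_at(times, surv, grid[:-1])
    return float(np.sum(heights * np.diff(grid)))

def rmst(times, surv, tau):
    """Restricted mean survival time: area under S(u) from 0 to tau."""
    return _step_integral(times, surv, 0.0, tau)

def conditional_rmst(times, surv, s, tau):
    """Expected further survival time up to tau, given survival beyond s."""
    return _step_integral(times, surv, s, tau) / float(_survival_at(times, surv, s))

# Toy survival curve (invented): S drops to 0.8, 0.5, 0.2 at years 1, 2, 3.
jump_times = [1.0, 2.0, 3.0]
surv_values = [0.8, 0.5, 0.2]
baseline_rmst = rmst(jump_times, surv_values, tau=4.0)                      # about 2.5 years
updated_rmst = conditional_rmst(jump_times, surv_values, s=2.0, tau=4.0)    # about 1.4 years
```

The second value is the quantity that gets updated in dynamic prediction: once the patient is known to have survived two years, the remaining restricted life expectancy is recomputed from the curve conditional on that fact.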

II. Methods

Neural networks have shown powerful capabilities in prediction tasks but are limited by their inherent lack of interpretability. Therefore, we developed a novel interpretable RMST dynamic prediction model called Dynamic-DeepRMST, which uses cRMST as the prediction outcome and integrates deep learning techniques. To enhance both accuracy and interpretability, we modified the Transformer model to effectively capture longitudinal data representations for prediction and to transparently quantify the input-output relationship.

III. Results

Compared with existing RMST static and dynamic regressions, the proposed model demonstrated superior performance on concordance index and mean absolute error under different simulation scenarios. Furthermore, the model was applied to Alzheimer's Disease Neuroimaging Initiative (ADNI) data to explain the impact of covariates on survival time from both individual-level and population-level analyses, providing insights for clinical prognosis. Our model not only reveals how longitudinal changes in covariates differentially affect survival time but also identifies the relevance among covariates and their importance in survival prediction evolving over time.

IV. Conclusion

Considering the nature of chronic progressive diseases, in which regular follow-ups track disease progression, we developed a dynamic prediction model based on RMST, a time-scale measure that is more intuitive than survival probability. By modifying the Transformer architecture, our model can effectively capture the longitudinal covariate trajectory and inherent relevance while avoiding parametric structural constraints, and it achieves comprehensive interpretability by quantifying the input-output relationship, providing prognostic insights at both the individual and population levels. To our knowledge, this paper is the first in which an interpretable deep learning dynamic prediction model was investigated.



11-prediction-prognostic-2: 4

Correcting for differential diagnosis bias across protected attributes in clinical prediction models using cancer stage information and causal inference

Jose Benitez-Aurioles1, Ricardo Silva2, Matthew Sperrin1

1University of Manchester, United Kingdom; 2University College London, United Kingdom

Background

Increasingly, clinical prediction models are developed using large datasets from routine clinical practice, such as electronic health records. These usually have larger sample sizes and are more representative of the general population, but do not have the same data quality assurances as ‘traditional’ clinical studies. In particular, underdiagnosis in routine care is a concern. In England, around 30% of people with type 2 diabetes or hypertension are undiagnosed, and models trained on data from clinical practice will underestimate the overall incidence of these conditions. Differential underdiagnosis happens when a patient’s characteristics affect their likelihood of diagnosis. If these characteristics are protected attributes like gender, ethnicity, or socio-economic status, clinical prediction models can exacerbate inequalities by diverting resources away from underserved groups to those already better served. Differential underdiagnosis is hard to address, as it is not easily measured. We propose a novel method to correct for differential underdiagnosis in cancer prediction models.

Methods

In epidemiology, underdiagnosis in cancer is often indirectly measured through diagnostic delay, as some underserved groups are sicker at the time of diagnosis. If there are quantitative markers of disease progression, these could be used to understand which groups are diagnosed later, and to correct for this. We show that this is possible in the specific case of cancer stage, assuming that all people with late-stage cancer have the same probability of being tested and diagnosed. We take a causal longitudinal approach, defining our estimand as the counterfactual patient-level risk of being diagnosed in a world in which the probability of being tested is not affected by baseline patient characteristics. By leveraging the difference in the ratio of early to late-stage cancer diagnoses across groups, we can estimate the probability that an individual with early-stage cancer gets tested, relative to a reference population. This estimate can be used, in turn, to appropriately adjust the prediction scores of groups commonly diagnosed at later stages.
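The ratio argument can be illustrated with a deliberately simplified, cross-sectional calculation. The sketch assumes, as in the text, that late-stage cancer is diagnosed with the same probability in every group; it additionally assumes that the true ratio of early to late-stage disease is the same in both groups, which is a simplification introduced here and not a claim from the abstract. The function name and the counts are hypothetical, and the authors' longitudinal causal estimator is not reproduced.

```python
def relative_early_testing_probability(early_group, late_group, early_ref, late_ref):
    """Ratio of early-to-late diagnosis ratios between a group and a reference group.

    Under the two assumptions stated above, this ratio estimates how likely a person
    with early-stage cancer in the group is to be tested, relative to the reference.
    """
    counts = (early_group, late_group, early_ref, late_ref)
    if min(counts) <= 0:
        raise ValueError("all diagnosis counts must be positive")
    return (early_group / late_group) / (early_ref / late_ref)

# Invented counts: the group has 300 early and 200 late-stage diagnoses,
# the reference population has 600 early and 200 late-stage diagnoses.
relative_probability = relative_early_testing_probability(300, 200, 600, 200)
```

In this toy case the group's early-to-late ratio is half that of the reference, which under the stated assumptions would suggest that early-stage cases in the group are about half as likely to be tested.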

Results

We provide theoretical proofs of the identifiability of these counterfactual predictions, and show how to estimate them in practice. We will use a simulation to evaluate the method and benchmark it against alternative approaches.

Conclusion

This work has potential applications in cancer screening, particularly in considerations of fairness in early detection. Further work will explore alternative applications and extend the concept to continuous, instead of binary, markers of disease progression.



 
Conference: ISCB46