41-missing-data-imputation: 1
Missing value imputation methods in prediction model development: a neutral comparison of approaches
Manja Deforth1,2, Georg Heinze3, Ulrike Held1
1Department of Biostatistics at the Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Zurich, Switzerland; 2MSD Merck Sharp & Dohme AG, Zurich, Switzerland; 3Center for Medical Data Science, Institute of Clinical Biometrics, Medical University of Vienna, Vienna, Austria
Background: In the development of prognostic models, missing predictor data are quite common and can be handled with imputation methods. In a neutral comparison study, we aimed to compare three popular imputation methods by simulating data resembling a Swiss multicenter prospective cohort study on long COVID.
Methods: Assuming a binary outcome and nine predictors, we designed 36 scenarios resulting from different sample sizes, proportions of incomplete cases, and missingness mechanisms in three predictors. The missing data were imputed with three imputation algorithms: missForest, aregImpute and mice. The missForest algorithm is based on random forests, aregImpute uses flexible additive imputation models with samples drawn from a predictive posterior distribution, and mice uses linear-additive models for imputation. We conducted a single imputation for missForest, 5 and 100 imputations for mice, and 100 imputations for aregImpute. Prediction models were estimated on the imputed datasets using linear-additive logistic regression. We also performed a complete case analysis without imputing the missing data. All prognostic models were validated on validation cohorts without missing values by evaluating overall performance (scaled Brier score), model discrimination (c-statistic), and calibration intercept and slope.
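As a rough illustration of the three imputation workflows compared here, the R packages could be called as follows. This is a minimal sketch, not the authors' actual simulation code; the data frame dat, the outcome y and the predictors x1..x9 are hypothetical.

```r
library(mice)        # chained-equations multiple imputation
library(missForest)  # random-forest-based single imputation
library(rms)         # loads Hmisc (aregImpute) and provides fit.mult.impute/lrm

# 'dat': hypothetical data.frame with binary outcome y and predictors
# x1..x9, three of which contain missing values

# mice: m imputed datasets, logistic fits pooled via Rubin's rules
imp_mice <- mice(dat, m = 5, seed = 1)
fit_mice <- with(imp_mice,
                 glm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9,
                     family = binomial))
summary(pool(fit_mice))

# aregImpute: 100 imputations, model fitted and pooled with fit.mult.impute
imp_areg <- aregImpute(~ y + x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9,
                       data = dat, n.impute = 100)
fit_areg <- fit.mult.impute(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9,
                            lrm, imp_areg, data = dat)

# missForest: a single completed dataset, then an ordinary logistic fit
imp_mf <- missForest(dat)
fit_mf <- glm(y ~ ., family = binomial, data = imp_mf$ximp)
```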
Results: Complete case analysis resulted in the lowest prediction model performance, and for the imputation methods a higher proportion of missing values was associated with lower model performance. Scaled Brier scores for mice and aregImpute models were higher than for missForest. aregImpute performed remarkably well, yielding a calibration slope close to one, even higher than when no data were missing. Model calibration was influenced more strongly by the choice of imputation method than model discrimination.
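For concreteness, the validation measures reported above could be computed on a validation cohort along these lines; a sketch only, in which the fitted logistic model fit and the validation data val are hypothetical objects.

```r
# predicted probabilities and linear predictor on the validation cohort
p  <- predict(fit, newdata = val, type = "response")
lp <- predict(fit, newdata = val)              # linear predictor (link scale)

# scaled Brier score: 1 - Brier / Brier of the null (mean-rate) model
brier        <- mean((val$y - p)^2)
brier_null   <- mean((val$y - mean(val$y))^2)
scaled_brier <- 1 - brier / brier_null

# c-statistic (area under the ROC curve)
c_stat <- Hmisc::somers2(p, val$y)["C"]

# calibration slope and intercept: regress the outcome on the linear predictor
cal_slope     <- coef(glm(y ~ lp, family = binomial, data = val))[2]
cal_intercept <- coef(glm(y ~ offset(lp), family = binomial, data = val))[1]
```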
Conclusion: Using aregImpute for the imputation of missing values resulted in shrinkage of the regression coefficients of the prediction model, leading to a near-optimal calibration slope. Multiple imputation methods such as mice and aregImpute can be recommended in most cases, particularly when the true values of the missing data cannot be ascertained.
Publication: Deforth M, Heinze G, Held U. The performance of prognostic models depended on the choice of missing value imputation algorithm: a simulation study. J Clin Epidemiol 2024; 176:111539.
Disclaimer: This work was done while the first author was working at the University of Zurich.
41-missing-data-imputation: 2
Assessing the Impact of Percentage of Missing Data and Imputation Methods on Youden Index Estimation
Sergio Sabroso-Lasa1, Luis Mariano Esteban2, Tomás Alcalá-Nalvaiz3, Núria Malats1
1Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO) and CIBERONC, Madrid, Spain; 2Department of Applied Mathematics, Escuela Universitaria Politécnica La Almunia, University of Zaragoza, Zaragoza, Spain; 3Department of Statistical Methods, University of Zaragoza, Zaragoza, Spain
Background
The rapid advancement of computational methods and data collection technologies has resulted in an exponential increase in the generation of large, complex datasets with numerous variables, often sourced from multiple origins. As a result, managing missing data has become a critical challenge in statistical modeling, especially in large databases, where the percentage and distribution of missing data can significantly affect analytical outcomes.
Effectively imputing missing values is crucial for ensuring the reliability of statistical inferences and predictive models. Although various imputation techniques are available, understanding how the proportion and distribution of missing data influence model performance remains a complex issue that requires further investigation. While previous research has examined the impact of missing data on model discrimination through metrics such as the area under the ROC curve (AUC), the specific effect on the Youden Index (J = sensitivity + specificity - 1), a key measure of test effectiveness that combines sensitivity and specificity, has not yet been explored.
Method(s) and Results
We conducted simulations under realistic conditions to assess the impact of missing data on the estimation of the Youden Index. These scenarios included independent normally distributed variables with varying predictive capacities, predefined correlation structures, categorical variables, and skewed distributions. Additionally, we analyzed cases where missing data followed specific predefined patterns.
We applied various imputation methods, including MissForest, Multivariate Imputation by Chained Equations (MICE), and k-nearest neighbors (KNN), to evaluate the predictive value of the models across different levels of missing data, ranging from 5% to 75%. The effectiveness of each method was assessed using key diagnostic metrics such as the AUC, sensitivity, specificity, and the Youden Index.
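As an illustration of the target quantity, the Youden Index and its optimal cutoff could be computed after imputation with the pROC package in R. A minimal sketch under assumed inputs: y (true binary status) and score (the model prediction obtained from the imputed predictors) are hypothetical objects.

```r
library(pROC)

# ROC curve for the predictions obtained after imputation
roc_obj <- roc(y, score)

# cutoff maximising the Youden index J = sensitivity + specificity - 1
# (recent versions of pROC return a data.frame from coords)
cc <- coords(roc_obj, "best", best.method = "youden",
             ret = c("threshold", "sensitivity", "specificity"))
J <- cc$sensitivity + cc$specificity - 1   # Youden index at the optimal cutoff
```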
Our findings indicate that most diagnostic metrics decrease by 20–30% compared to models with complete data, except for specificity, which remains comparatively robust. Importantly, the Youden Index varies substantially with the proportion of missing data, highlighting the challenge of establishing an optimal cutoff point in clinical practice when dealing with incomplete datasets.
Conclusions
Our findings underscore the significant impact of missing data on diagnostic metrics, particularly regarding the proportion of missing values. While most predictive models incorporate an imputation step, few account for how the distribution of missing data influences overall model performance. This analytical oversight restricts potential enhancements in evaluation metrics and can lead to unreliable Youden Index values and cutoff points. These findings highlight the necessity for further research to refine imputation strategies and improve the reliability of predictive models in the context of missing data.
41-missing-data-imputation: 3
Recorded Reasons of Missingness-Informed Sensitivity Analyses in Clinical Trials
Dries Reynders1, Jammbe Musoro2, Saskia le Cessie3, Els Goetghebeur1
1Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium; 2European Organisation for Research and Treatment of Cancer (EORTC) Headquarters, Brussels, Belgium; 3Department of Biomedical Data Sciences and Department of Clinical Epidemiology, Leiden University Medical Center, The Netherlands
In clinical trials, missing longitudinal outcome data, such as patient-reported outcomes (PROs), are most often addressed by assuming missing at random (MAR) and fitting a mixed model, sometimes accompanied by missing not at random (MNAR) sensitivity analyses. Typically, these sensitivity analyses add a shift towards worse outcomes to the MAR-imputed values or use reference-based imputation as a one-size-fits-all approach.
This practice leaves room for more tailored approaches in which different categories of missing outcomes are handled differently. If available, categorization can be based on recorded reasons for missingness (too ill, inconvenient, administrative failure, …). These reasons not only allow assessing the credibility of the MAR assumption but may also inform more differentiated and realistic MNAR sensitivity scenarios. To this end, we can link the recorded reasons for missingness to later observed outcomes.
For intermittent missingness, surrounding observations may display very different patterns across reasons for missingness. If these patterns essentially reflect the underlying truth, e.g. stable observed patterns in patients with missingness due to administrative failure but a tendency to decline in patients who were too ill at some visit, assuming MAR may still be reasonable. On the other hand, these differences may inform different MNAR imputation scenarios, with the handling of missingness dependent on the reason, as sketched below.
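One way to implement such reason-dependent scenarios is a delta adjustment of MAR imputations, with a different shift per recorded reason. A minimal sketch, assuming a hypothetical long-format dataset long with PRO score qol and reason codes "admin", "inconvenient" and "too_ill"; the delta values are illustrative choices for a sensitivity scenario, not recommendations.

```r
library(mice)

# imputation under MAR
imp <- mice(long, m = 20, seed = 42)

# reason-specific shifts: administrative failure is treated as plausibly MAR
# (no shift), while missingness because the patient was too ill is shifted
# towards worse QoL
delta <- c(admin = 0, inconvenient = -2, too_ill = -8)

miss_idx <- which(is.na(long$qol))
for (m in 1:imp$m) {
  comp <- complete(imp, m)
  comp$qol[miss_idx] <- comp$qol[miss_idx] +
    delta[as.character(long$reason[miss_idx])]
  # ... analyse each shifted dataset and pool with Rubin's rules
}
```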
With attrition, the longitudinal outcomes are no longer observed, but adverse events and intercurrent events such as disease progression or death may still be recorded. These too provide a vehicle for setting up more credible sensitivity analyses or MAR imputation.
The usefulness of the information captured by the recorded reasons for missingness is not confined to the longitudinal analysis itself. If reasons for missingness prove to be predictive of death or drop-out, they may also be used to relax the non-informative censoring assumption in time-to-event analyses. Incorporating these reasons in inverse probability of censoring weighting (IPCW) analyses may then provide more robust evidence.
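A minimal sketch of such an IPCW analysis, assuming a hypothetical interval-split dataset d_long with counting-process columns tstart, tstop and event, and indicator variables for the recorded reasons (reason_ill, reason_admin); all names are illustrative, and the covariates are assumed complete.

```r
library(survival)

# pooled logistic model for remaining uncensored in each interval,
# with the recorded reasons for PRO missingness as predictors
cens_fit <- glm(uncensored ~ arm + reason_ill + reason_admin + visit,
                family = binomial, data = d_long)
d_long$p_unc <- predict(cens_fit, type = "response")

# cumulative product of inverse probabilities gives the IPC weights
d_long <- d_long[order(d_long$id, d_long$visit), ]
d_long$ipcw <- ave(1 / d_long$p_unc, d_long$id, FUN = cumprod)

# weighted analysis of overall survival with a robust (clustered) variance
fit_os <- coxph(Surv(tstart, tstop, event) ~ arm + cluster(id),
                data = d_long, weights = ipcw)
```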
In a large randomized oncology trial with high mortality comparing radiotherapy alone with adjuvant and concomitant chemotherapy, we investigate in depth how recorded reasons for missing PRO data are linked to observed outcomes, overall survival, and its censoring. Building on the resulting insights, we set up sensitivity analyses for the PROs and overall survival. Comparing these with more standard analyses reveals the value of recording these reasons.
41-missing-data-imputation: 4
Adjusting for outcome reporting bias in meta-analysis: a multiple imputation approach
Cora Burgwinkel1, Leonhard Held1,2
1Epidemiology, Biostatistics and Prevention Institute (EBPI), Universität Zürich, Switzerland; 2Center for Reproducible Science (CRS), Universität Zürich, Switzerland
Background: Outcome reporting bias (ORB) occurs when research study outcomes are selectively reported based on their results. ORB potentially undermines the credibility and validity of meta-analyses and contributes to research waste by distorting overall treatment effects. ORB can be viewed as a missing data problem where unreported outcomes introduce bias. Despite the serious implications ORB poses, it remains an underrecognized issue, with only a few adjustment methods available.
Methods: We propose an approach that addresses unreported studies in meta-analyses through multiple imputation. The imputed data are reweighted using importance sampling to provide an adjusted estimate of the treatment effect, building on existing methods for selection bias from the literature [1]. To assess the impact of ORB in meta-analyses of clinical trials, we apply our proposed methodology to real clinical data affected by ORB. Additionally, we conduct a simulation study to evaluate the method’s performance, focusing on treatment effect estimation across varying degrees of selective non-reporting.
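To convey the flavour of the approach, the following R sketch combines multiple imputation of unreported effects with importance-sampling weights, in the spirit of [1]. It is a simplified illustration, not the authors' implementation: yi and vi (reported effects and variances) and vi_mis (assumed variances of the unreported studies) are hypothetical, and the one-sided significance-based selection model is only one possible working assumption.

```r
library(metafor)

# random-effects model on the reported studies
fit <- rma(yi, vi)

M <- 1000
est <- w <- numeric(M)
for (m in 1:M) {
  # impute the unreported effects from the predictive distribution
  yi_mis <- rnorm(length(vi_mis), mean = coef(fit),
                  sd = sqrt(fit$tau2 + vi_mis))
  # importance weight: probability that each imputed result is non-significant,
  # under the working assumption that significant results were reported
  w[m]   <- prod(pnorm(1.96 - yi_mis / sqrt(vi_mis)))
  # pooled effect after adding the imputed studies
  est[m] <- coef(rma(c(yi, yi_mis), c(vi, vi_mis)))
}
adj_estimate <- sum(w * est) / sum(w)   # importance-sampling-adjusted effect
```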
Results: The proposed method successfully adjusts for ORB under assumptions of selective non-reporting. The results demonstrate that ORB can significantly affect the conclusions of a meta-analysis, particularly when the number of unreported studies is large.
Conclusion: Imputing unreported outcomes provides a promising approach to addressing ORB in meta-analyses. The method assumes a specific mechanism for non-reporting and has so far been applied only to summary-level data. Moreover, only the univariate approach has been explored, meaning that ORB adjustment was investigated separately for each outcome. Further research is required to extend our approach to multivariate meta-analysis, allowing for simultaneous adjustment of multiple outcomes. Additionally, applying the proposed method to individual patient data (IPD) could provide more precise and reliable ORB adjustment.
References: [1] Carpenter J, Rücker G, Schwarzer G. Assessing the sensitivity of meta-analysis to selection bias: a multiple imputation approach. Biometrics 2011; 67(3):1066–1072.
41-missing-data-imputation: 5
Comparing Estimation Methods for Expected Quality of Life and Predicted Patient-Specific Trajectories in Oncology: Addressing Missingness/Death (Not) at Random
Eline Vanderpijpen, Els Goetghebeur
Ghent University (Belgium)
Cancer treatments affect both patient survival and quality of life (QoL). Insights into expected QoL and the variation in patient-specific QoL predicted under different treatments may therefore support treatment selection when combined with survival curves. The survival curve together with the expected 'QoL while alive' nevertheless remains an important two-dimensional target estimand for treatment policy evaluation.
We adapted and compared various statistical methods to estimate average QoL while alive and to predict individual QoL trajectories under different treatments. Missing data following intercurrent events, such as treatment discontinuation, are common in this setting and can plausibly be assumed missing at random (MAR), while death is likely not at random (DNAR). Weighted generalized estimating equations (WGEE), mixed models, and joint models each allow, in their own way, for MAR missingness in the longitudinal setting in addition to DNAR.
At specific time points, the average QoL among the living can be estimated using weighted GEE, or doubly robust estimators, weighting for missingness but not for death at each time point. This performed well, even with data simulated under joint models. In contrast, standard mixed and joint models naturally target a hypothetical estimand, averaging QoL over an 'immortal population' by implicitly imputing QoL after death. When death is not at random, such mixed model estimates are biased for this hypothetical estimand. By using a shared parameter model framework to analyse survival data and QoL measurements together, joint models avoid this bias. We found their derived parametric estimators for the average QoL among the living to be less accurate than those from WGEE, possibly due to finite samples.
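A minimal sketch of the WGEE step in R with geepack, under assumed inputs: long is a hypothetical long-format dataset restricted to patients known to be alive at each visit, with observation indicator obs and a complete covariate last_qol (e.g. last observed QoL); all variable names are illustrative.

```r
library(geepack)

# step 1: model the probability of observing QoL given being alive,
# then invert it to obtain inverse probability weights
obs_fit <- glm(obs ~ arm + visit + last_qol, family = binomial, data = long)
long$w  <- 1 / predict(obs_fit, type = "response")

# step 2: weighted GEE for QoL-while-alive among the observed measurements
# (independence working correlation for validity of the weighting)
fit_wgee <- geeglm(qol ~ arm * visit, id = id,
                   data = subset(long, obs == 1),
                   weights = w, corstr = "independence")
summary(fit_wgee)
```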
Mixed models and joint models can predict patient-specific QoL trajectories once the random effects are estimated, from best linear unbiased predictors or expectation-maximization (EM) estimators, respectively. Under DNAR, predictions from mixed models were again biased, but joint model predictions performed reasonably well. With normal random effects, however, EM estimates shrank towards the mean QoL, generating overly optimistic predictions for the most ill patients. For improved accuracy, we explore alternative methods for estimating the random effects, including a mixture of normal prior distributions.
We performed a phase 1 simulation study and re-analysed a randomized oncology trial with high mortality rates using the methods described. Our findings support using WGEE for these estimands, while reworked joint models may allow evaluation of QoL in cancer trials for more personalized treatment decisions.