Conference Agenda

Session
Meta-analysis 2
Time: Tuesday, 26/Aug/2025, 11:30am–1:00pm

Location: ETH E27

D-BSSE, ETH, 84 seats

Presentations
28-1 Meta-analysis 2: 1

How to quantify between-study heterogeneity in single-arm evidence synthesis

Ulrike Held1, Lea Bührer1,2, Beatrix Latal3, Stefania Iaquinto1

1Epidemiology, Biostatistics and Prevention Institute, University of Zurich, Switzerland; 2Centre for Computational Health, Institute of Computational Life Sciences, Zurich University of Applied Sciences (ZHAW), Wädenswil, Switzerland; 3Child Development Center, University Children's Hospital, University of Zurich, Switzerland

Background: Random-effects meta-analysis models account for between-study heterogeneity by estimating and incorporating the heterogeneity variance parameter τ². Different estimators for τ² have been proposed, but no widely accepted guidance exists on which estimator is best suited to which setting. Meta-analysis of single-arm observational studies poses unique challenges, such as large variability in outcomes, sparse data, and high methodological heterogeneity, yet systematic evaluations and comparisons of the different heterogeneity variance estimators in this setting are lacking. A neutral comparison simulation study was conducted to represent typical meta-analysis scenarios for continuous and binary outcomes in a single-arm meta-analysis setting. Furthermore, a non-systematic literature review was conducted, and the methods were applied to a case study involving infants with congenital heart disease (1).

Methods: Seven different estimators of τ² were preselected based on their use in clinical research and their availability in the R programming environment. Their performance was assessed in terms of mean bias, mean squared error, and the proportion of estimates equal to zero. Additionally, coverage and bias-eliminated coverage were evaluated using Wald and Hartung-Knapp confidence intervals, and prediction intervals were calculated. In a non-systematic literature review, we assessed which meta-analysis methods are currently used in high-ranking medical journals.
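To illustrate the kind of machinery being compared (the seven preselected estimators are not named in the abstract, and this is not the authors' code), a minimal sketch of one widely used τ² estimator, DerSimonian-Laird, together with a random-effects Wald confidence interval:

```python
import math
from statistics import NormalDist

def tau2_dersimonian_laird(y, v):
    """DerSimonian-Laird estimator of the between-study variance tau^2.

    y: per-study effect estimates; v: their within-study variances.
    Moment-based: the excess of Cochran's Q over its expectation k - 1,
    truncated at zero (this truncation is what produces the frequent
    zero estimates discussed in the Results).
    """
    w = [1.0 / vi for vi in v]                               # fixed-effect weights
    sw = sum(w)
    mu_fe = sum(wi * yi for wi, yi in zip(w, y)) / sw        # pooled FE estimate
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(y) - 1)) / c)

def random_effects_wald_ci(y, v, tau2, level=0.95):
    """Wald confidence interval for the pooled effect under a
    random-effects model with a given tau^2."""
    w = [1.0 / (vi + tau2) for vi in v]                      # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return mu - z * se, mu + z * se
```

The Hartung-Knapp interval mentioned above instead uses a t quantile on k - 1 degrees of freedom with a modified variance estimate, and a prediction interval additionally widens the standard error by the estimated τ².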

Results: Our neutral comparison simulation study showed imprecision across all heterogeneity variance estimators, particularly in meta-analyses with a small number of studies or when analysing binary outcomes with rare events. Many heterogeneity variance estimators frequently produced zero heterogeneity estimates, even in the presence of heterogeneity. Notably, while the estimated overall effects remained relatively robust, prediction intervals varied substantially across methods. Additionally, our literature review indicated a low level of statistical literacy regarding heterogeneity variance estimators in single-arm meta-analyses, with over half of the reviewed studies failing to report the estimator used. A preprint of our study is available (2).

Conclusion: We conclude that relying on a single heterogeneity variance estimator is not appropriate for single-arm meta-analysis of observational studies. Instead, we recommend using multiple estimators in a sensitivity analysis, especially when evaluating prediction intervals.

References

(1) Feldmann, M., Bataillard, C., Ehrler, M., Ullrich, C., Knirsch, W., Gosteli-Peter, M. A., Held, U., Latal, B. (2021). Cognitive and Executive Function in Congenital Heart Disease: A Meta-analysis. Pediatrics, 148(4), e2021050875. https://doi.org/10.1542/peds.2021-050875

(2) Iaquinto, S., Feldmann, M., Latal, B., et al. How to quantify between-study heterogeneity in single-group evidence synthesis? It depends! Preprint (Version 1), Research Square. https://doi.org/10.21203/rs.3.rs-2450618/v1



28-1 Meta-analysis 2: 2

Bayesian random-effects meta-analysis with empirical heterogeneity priors for HTA applications in the situation of very few studies

Ralf Bender1, Jona Lilienthal1, Sibylle Sturtz1, Christian Röver2, Tim Friede2

1Department of Medical Biometry, Institute for Quality and Efficiency in Health Care (IQWiG), Cologne, Germany; 2Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany

Background:

In Bayesian random-effects meta-analysis, the use of (weakly) informative heterogeneity priors is of particular benefit in the case of very few studies, a situation often encountered in HTA applications [1]. Empirical heterogeneity priors derived from Cochrane reviews are available, but it is unclear whether these are adequate for HTA applications [5]. Different heterogeneity priors have been proposed in the literature, as well as methods to derive prior distributions from empirical meta-analyses [3-5].

Methods:

We collected all relevant meta-analyses from IQWiG reports for the period 2005 to 2021. We considered the effect measures SMD (standardized mean difference) for continuous data, HR (hazard ratio) for time-to-event data, and OR and RR (odds ratio and risk ratio) for binary data. The heterogeneity parameter was re-estimated by applying random-effects meta-analyses with the Knapp-Hartung and the Paule-Mandel methods. The hierarchical Bayesian model proposed by Röver et al. [4] was applied to derive empirical heterogeneity priors for the different effect measures. We compared these with previous proposals for heterogeneity priors [3,5], and we compared the meta-analytic results of the Bayesian approach with those of the former IQWiG approach for evidence synthesis in the case of very few studies.
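The Paule-Mandel estimator mentioned above chooses τ² so that a generalized Q statistic equals its expectation k - 1. A minimal sketch (an illustration only, not IQWiG's implementation), solved here by bisection:

```python
def tau2_paule_mandel(y, v, tol=1e-10, max_iter=200):
    """Paule-Mandel estimator of tau^2 (illustrative sketch).

    y: per-study effect estimates; v: their within-study variances.
    Solves Q(tau^2) = k - 1, where Q uses inverse-variance weights
    1 / (v_i + tau^2); Q is decreasing in tau^2, so bisection works.
    """
    k = len(y)

    def gen_q(tau2):
        w = [1.0 / (vi + tau2) for vi in v]
        mu = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        return sum(wi * (yi - mu) ** 2 for wi, yi in zip(w, y))

    if gen_q(0.0) <= k - 1:          # no excess dispersion: tau^2 = 0
        return 0.0
    lo, hi = 0.0, 1.0
    while gen_q(hi) > k - 1:         # expand until the root is bracketed
        hi *= 2.0
    for _ in range(max_iter):        # bisection on the bracket [lo, hi]
        mid = (lo + hi) / 2.0
        if gen_q(mid) > k - 1:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0
```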

Results:

We derived empirical heterogeneity priors based on the half-normal distribution, which place more distributional weight on smaller heterogeneity values than previous suggestions. Evidence synthesis based on the new heterogeneity priors more frequently allows for a quantification of the treatment effect than the former IQWiG approach [2].
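To see what "more distributional weight on smaller heterogeneity values" means in practice, a half-normal prior HN(scale) on the heterogeneity standard deviation τ can be probed via its CDF (the scale values in the test below are hypothetical, not those derived in the study):

```python
from statistics import NormalDist

def halfnormal_cdf(x, scale):
    """CDF of a half-normal prior HN(scale) on the heterogeneity
    standard deviation tau: P(tau <= x) = 2 * Phi(x / scale) - 1,
    where Phi is the standard normal CDF."""
    return 2.0 * NormalDist(0.0, scale).cdf(x) - 1.0
```

For example, a hypothetical HN(0.5) prior puts roughly 31% of its mass on τ ≤ 0.2, while HN(1.0) puts roughly 16% there; a smaller scale thus concentrates the prior on smaller heterogeneity values. The scales actually derived in the study are reported in [2].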

Conclusions:

The new heterogeneity priors are suitable for the application of Bayesian random-effects meta-analyses with very few studies in the HTA framework.

References

[1] Bender, R. et al. (2018): Methods for evidence synthesis in the case of very few studies. Res. Syn. Methods 9, 382–392.

[2] Lilienthal, J. et al. (2024): Bayesian random-effects meta-analysis with empirical heterogeneity priors for application in health technology assessment with very few studies. Res. Syn. Methods 15, 275–287.

[3] Röver, C. et al. (2021): On weakly informative prior distributions for the heterogeneity parameter in Bayesian random-effects meta-analysis. Res. Syn. Methods 12, 448–474.

[4] Röver, C. et al. (2023): Summarizing empirical information on between-study heterogeneity for Bayesian random-effects meta-analysis. Stat. Med. 42, 2439–2454.

[5] Turner, R.M. et al. (2015): Predictive distributions for between-study heterogeneity and simple methods for their application in Bayesian meta-analysis. Stat. Med. 34, 984–998.



28-1 Meta-analysis 2: 3

Assessing inconsistency in flexible meta-regression using network meta-analysis

Marc Angelo Parsons, Andrea Benedetti, Russell Steele

McGill University, Canada

Background

Medical research is often interested in changes in health outcomes over time, called trajectories. While flexible regression methods such as spline or fractional polynomial models are well developed for single studies, the situation is less clear for trajectory meta-analysis (MA) of multiple primary studies when outcome assessment patterns differ between studies.

Network meta-analysis (NMA) simultaneously estimates multiple pairwise effect differences among a set of treatments. This method has well-defined assumptions and an accepted framework for evaluating results. The fundamental assumption of consistency between direct and indirect comparisons can be quantified in several ways. By measuring consistency in a longitudinal context, we can assess the impact of heterogeneous outcome assessment patterns.

Methods

We present a novel application of NMA to the estimation of longitudinal trajectories: Discrete Time NMA (DTNMA). Briefly, DTNMA considers pairwise comparisons between discrete timepoints rather than treatments, leveraging both direct and indirect evidence to map out an estimated trajectory. This work outlines the underlying methodology, including assumptions, of DTNMA. Two case studies are presented: one using a motivating dataset of perinatal depression scores and another employing a systematic review of depression scores measured during the COVID-19 pandemic.
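The direct-versus-indirect contrast underlying the inconsistency assessment can be sketched in the style of Bucher's adjusted indirect comparison, with timepoints playing the role of treatments (a simplified illustration, not the authors' DTNMA implementation):

```python
import math
from statistics import NormalDist

def bucher_inconsistency(d_direct, v_direct, d_ab, v_ab, d_bc, v_bc):
    """Inconsistency between direct and indirect evidence for a
    comparison A vs C; in the DTNMA setting A, B, C are discrete
    timepoints rather than treatments (illustrative sketch).

    d_direct, v_direct: direct A->C change estimate and its variance.
    d_ab, v_ab and d_bc, v_bc: the two legs of the indirect route.
    """
    d_indirect = d_ab + d_bc                     # indirect A->C via B
    v_indirect = v_ab + v_bc                     # variances add along the chain
    diff = d_direct - d_indirect                 # inconsistency estimate
    se = math.sqrt(v_direct + v_indirect)
    z = diff / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return diff, se, p
```

A large, precisely estimated difference between the two routes flags the kind of inconsistency that, in DTNMA, points to heterogeneous outcome assessment patterns.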

Results

In the first case study, we found moderate evidence for a decline in depression scores over the perinatal period. In the second case study, we found no evidence of change in depression scores over the two years following the start of the COVID-19 pandemic. In both case studies, however, the DTNMA models revealed high levels of heterogeneity and inconsistency between studies due to differing outcome assessment patterns.

Conclusion

The novel method presented here (DTNMA) provides researchers with a way to assess inconsistency in MA of trajectories arising from irregular outcome assessment patterns across included studies. In standard flexible meta-regression, it is not clear how to measure the effect of this issue on model results; we have shown that an NMA approach offers a possible pathway to resolving it. In the case studies presented, high levels of heterogeneity and inconsistency in outcome assessment patterns may limit the conclusions that can be drawn. We have demonstrated how inconsistency in an NMA can give researchers insight into the previously unconsidered issue of differing outcome assessment patterns between included studies.



28-1 Meta-analysis 2: 4

Illuminating the Assumptions of Meta-Regression in Treatment Networks

Nana-adjoa Kwarteng1, Theodoros Evrenoglou1, Adriani Nikolakopoulou1,2

1Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Centre, University of Freiburg, Freiburg im Breisgau, Germany; 2Department of Hygiene, Social-Preventive Medicine and Medical Statistics, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece

Background / Introduction

Network meta-analysis (NMA) is a common statistical method used to synthesize evidence from multiple studies and enable the simultaneous comparison of multiple competing treatments for a given condition. When meta-analysts wish to explore remaining heterogeneity, network meta-regression (NMR) provides a valuable extension to NMA, adjusting treatment effect estimates for study-level characteristics in an attempt to further explain heterogeneity across studies. Despite its usefulness, adoption of NMR has been limited by conceptual challenges, limited accessibility of implementations, and limitations in sparse-data settings. NMR model coefficients can be assumed independent, exchangeable, or common across treatment comparisons, each option with or without the consistency assumption. However, choosing between these models is complex, as different modeling assumptions can lead to varying results and interpretations depending on factors like model fit and data availability. Additionally, before fitting NMR models, meta-analysts must examine the relevance of data directionality, where certain study characteristics (e.g., sponsorship) systematically influence treatment effect estimates, introducing potential bias.
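To make the idea of a coefficient on a study-level covariate concrete, a minimal fixed-effect meta-regression for a single treatment comparison can be sketched as follows (an illustration only; NMR extends this to a whole network, with the slope assumed independent, exchangeable, or common across comparisons):

```python
def meta_regression(y, v, x):
    """Fixed-effect meta-regression of study effects y on one
    study-level covariate x, using inverse-variance weights 1/v.
    Solves the 2x2 weighted least-squares normal equations by hand
    and returns (intercept, slope); the slope is the analogue of a
    treatment-by-covariate interaction in NMR (illustrative sketch).
    """
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx * swx          # singular if x has no variation
    intercept = (swxx * swy - swx * swxy) / det
    slope = (sw * swxy - swx * swy) / det
    return intercept, slope
```

The `det` term also hints at the sparse-data complexity noted in the Results: with small subgroups and little variation in the observed covariate, the design matrix becomes nearly singular and the interaction is poorly identified.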

Methods

To address these challenges, we introduce frequentist tools and a graphical toolkit for NMR models, extending the landscape of implementation tools available to analysts. Additionally, we investigate different modeling assumptions and explore their properties under various scenarios. Based on the properties of the available model assumptions, we provide recommendations for implementation of NMR when considering data availability and the research question of interest. We also provide guidance for the interpretation of treatment-by-covariate interactions in relation to their underlying NMR assumptions.

Results

We illustrate our methods by comparing different NMR modeling assumptions in a network of 10 diabetes treatments and through simulations under various data-generating scenarios. Detailed examination of the output highlights the importance of directionality in models that do not impose consistency on the treatment-by-covariate interactions. Furthermore, the results emphasize the complexities that arise in scenarios with small subgroups lacking sufficient variation in the observed covariates.

Conclusion

This work elucidates some of the assumptions underlying NMR, the importance of the network structure, and the impact of sparse data on NMR. The introduced tools can streamline the implementation of NMR, facilitating the exploration of sources of heterogeneity and inconsistency in NMA and expanding the toolkit available to meta-analysts in evidence synthesis.



28-1 Meta-analysis 2: 5

Transitivity in Network Meta-Analysis: A Formal Framework and Practical Implications

Noosheen Rajabzadeh Tahmasebi1, Ian White2, Georgia Salanti3, Orestis Efthimiou3,4, Adriani Nikolakopoulou1,5

1Institut für Medizinische Biometrie und Statistik (IMBI), Universitätsklinikum Freiburg (Germany); 2MRC Clinical Trials Unit, University College London (UK); 3Institute of Social and Preventive Medicine (ISPM), University of Bern (Switzerland); 4Berner Institut für Hausarztmedizin (BIHAM), University of Bern (Switzerland); 5Laboratory of Hygiene, Social and Preventive Medicine and Medical Statistics, Aristotle University of Thessaloniki (Greece)

Network meta-analysis (NMA) synthesizes evidence for multiple treatments, relying on the assumptions of transitivity and consistency. Transitivity has been widely discussed, and many interpretations and practical considerations have been suggested. However, the formal definition of transitivity and its relation to consistency remain ambiguous.

We propose a clear definition of transitivity, viewing it as a property of counterfactual treatment effects derived from joint randomizability. In particular, we define transitivity as the equality of the true effects between treatments X and Y across studies, even when a study does not include X or Y. Subsequently, we show how consistency, defined as the agreement between different sources of evidence, can be derived as a consequence of this definition. We show how common interpretations of transitivity relate to its formal definition. We then link the transitivity assumption with assumptions about missing-data mechanisms: specifically, we demonstrate that transitivity is equivalent to the assumption of missing completely at random, and that a weaker assumption, missing at random, suffices for valid NMA when paired with likelihood-based analyses.

Our findings highlight key properties of the assumptions of NMA and call for careful examination of these assumptions (e.g., through examining the distributions of effect modifiers) to enhance the robustness of evidence synthesis.
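As a notation sketch (the symbols δ and d below are introduced here for illustration and are not taken from the abstract), the proposed definition and its consequence can be written as:

```latex
% \delta_{i,XY}: true (counterfactual) relative effect of Y versus X
% in the population of study i;  d_{XY}: the network-level effect.
% Transitivity: the same true effect holds in every study,
% even when study i includes neither X nor Y:
\delta_{i,XY} = d_{XY} \qquad \text{for every study } i .
% Consistency between direct and indirect evidence then follows, e.g.
d_{XZ} = d_{XY} + d_{YZ} .
```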