07-estimands-clinical: 1
What is the estimand for the proportional odds model?
Ian R White, Brennan C Kahan
UCL, United Kingdom
Background. The proportional odds model is frequently used to analyse an ordered categorical outcome in clinical trials. It assumes that the odds ratio is the same for all dichotomisations of the outcome.
We wanted to know what estimand the model estimates if this proportional odds assumption is false. The log odds ratio from a proportional odds model is known to be a function of the probability that a randomly selected person from the treatment group has a better outcome than a randomly selected person from the control group. However, this is hard to interpret. To interpret the log odds ratio from a proportional odds model better, we explore whether it can be expressed as a difference of an average score between the two groups (where the score is some transformation of the outcome).
Methods. We propose two novel approximate approaches which demonstrate how the proportional odds model contrasts the average scores between groups. Mathematically, we derive a closed-form expression for the maximum likelihood estimator for data near the null. Computationally, we estimate scores by equating influence (change in the log odds ratio due to omitting one observation) between the proportional odds model and a linear regression for scores. The methods are illustrated using the results of the FLU-IVIG trial.
Results. The mathematical derivation shows that the log odds ratio can be approximated as a difference of mean scores between groups, where the score for each outcome level is a linear function of the proportion below that level plus half the proportion at that level. The computational method agrees well with the mathematical expression, and can also be used away from the null or with covariate adjustment.
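As a rough illustration of the score described above, the sketch below computes, for each outcome level, the proportion below that level plus half the proportion at that level. It makes two assumptions that the abstract does not state: the proportions are taken from the pooled sample, and the constants of the linear function are omitted, so the resulting difference in mean scores is only proportional to the approximated log odds ratio. The data and function names are hypothetical.

```python
from collections import Counter


def midrank_scores(outcomes, levels):
    """Score per level: proportion below the level plus half the proportion at it.

    `levels` must be listed from worst to best. Proportions are computed from
    `outcomes` (here the pooled sample, which is an assumption of this sketch).
    """
    n = len(outcomes)
    counts = Counter(outcomes)
    scores = {}
    below = 0.0
    for level in levels:
        p = counts.get(level, 0) / n
        scores[level] = below + 0.5 * p
        below += p
    return scores


def mean_score(outcomes, scores):
    """Average implicit score in one group."""
    return sum(scores[y] for y in outcomes) / len(outcomes)


# Hypothetical ordinal outcomes, 1 = worst, 4 = best.
control = [1, 2, 2, 3, 3, 3, 4, 4]
treated = [2, 3, 3, 3, 4, 4, 4, 4]
levels = [1, 2, 3, 4]

scores = midrank_scores(control + treated, levels)
diff = mean_score(treated, scores) - mean_score(control, scores)
print(scores)
print(diff)
```

With scores of this form, two adjacent levels that are both rare receive almost the same score, because each adds little to the cumulative proportion.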
The FLU-IVIG trial had rare categories representing very poor clinical outcomes. The derived scores implied by the proportional odds model are shown to be very similar for these worst categories. This may be undesirable, since they have very different clinical importance.
Conclusion. The estimand for the proportional odds model is a difference of mean scores which are implicitly assigned by the model. Adjacent small categories have very similar implicit scores. Deriving scores provides a way to discuss the proportional odds model with clinicians, who can thus decide whether it suitably reflects clinical importance.
07-estimands-clinical: 2
On the use of the net treatment benefit as a treatment-effect measure in randomized clinical trials
Tomasz Burzykowski1,2, Vaiva Deltuvaite-Thomas2
1Hasselt University, Belgium; 2IDDI, Belgium
Generalized pairwise comparisons (GPC) is a non-parametric method designed to quantify the benefit of a new treatment, as compared to a control treatment, by using a set of hierarchically ordered endpoints. GPC yields an estimate of the treatment effect, the so-called net treatment benefit (NTB). For a single endpoint, NTB(d) = P(XE-XC>d) - P(XE-XC<-d), where XE and XC are the values of the endpoint in the experimental and control groups, respectively, and d is the threshold of clinical relevance. In this case, NTB(d) is simply the difference between the probability that the value of the endpoint for the experimental treatment is clinically “better” than for the control treatment and the probability that the value of the endpoint for the control treatment is “better” than for the experimental treatment.
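The single-endpoint definition of NTB(d) above can be sketched as an empirical estimate over all experimental-control pairs. This is a minimal illustration only: it assumes complete, uncensored data and that larger values of the endpoint are better, and the function name and data are hypothetical.

```python
def net_treatment_benefit(experimental, control, d=0.0):
    """Empirical NTB(d) = P(XE - XC > d) - P(XE - XC < -d) over all pairs."""
    wins = 0    # pairs in which the experimental value is better by more than d
    losses = 0  # pairs in which the control value is better by more than d
    for xe in experimental:
        for xc in control:
            diff = xe - xc
            if diff > d:
                wins += 1
            elif diff < -d:
                losses += 1
            # otherwise the pair is neutral and contributes to neither count
    n_pairs = len(experimental) * len(control)
    return (wins - losses) / n_pairs


# Hypothetical endpoint values.
experimental = [5, 7, 9]
control = [4, 6, 8]

ntb_0 = net_treatment_benefit(experimental, control, d=0)
ntb_2 = net_treatment_benefit(experimental, control, d=2)
print(ntb_0, ntb_2)
```

Raising the threshold d turns more pairs into neutral ones, so the estimate changes with d even though the data are unchanged.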
GPC and NTB are being promoted as the approach that allows benefit-risk assessment and that is useful for personalized medicine. In this paper, we take a critical look at some of the properties of NTB that may be important from the point of view of using it as a treatment-effect measure in randomized clinical trials. Several of the properties have already been investigated in the literature. For instance, NTB may be trial-specific, because it depends on the variability of the endpoint(s), which may change from trial to trial. NTB is not robust to missing data, and its application to right-censored data requires the use of corrections. We additionally show that, when hidden strata are present in a population, they may lead to biased estimation of NTB in a clinical trial. For instance, it is possible to obtain a non-zero NTB value when the true value is zero. We also show that NTB may suggest no benefit of either treatment when, in fact, the patients receiving an experimental treatment fare, on average, worse on it (as compared to patients receiving the control treatment) when the treatment does not work than when the treatment does work.
07-estimands-clinical: 3
Challenges and opportunities in defining new estimands for longitudinal outcomes truncated by death
Juliette ORTHOLAND1, Marie-Abèle BIND2
1Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié Salpêtrière, F-75013, Paris, France; 2MGH Biostatistics, Massachusetts General Hospital, Harvard Medical School, Boston
Introduction Quality of life, often measured through repeated measures in clinical trials, is a meaningful outcome for patients, even for deadly diseases. Nevertheless, studying longitudinal outcomes that are truncated by death raises methodological challenges in defining suitable estimands. Indeed, when patients die during a clinical trial, their longitudinal outcomes are not defined at the end of the clinical trial. Although a few estimands have been proposed to address this issue, many lack a formal causal interpretation.
Objective In our work, in line with guidelines of the European Medicines Agency, we discuss the relative advantages of different causal estimands for longitudinal outcomes truncated by death.
Method First, to better understand the challenges of estimands in this context, we start with a set-up without truncation. Then, we compare the different estimands proposed in the literature from a causal point of view and discuss their estimation. Finally, we illustrate this work using clinical trial data from amyotrophic lateral sclerosis (ALS).
Results We find two primary challenges in defining estimands for longitudinal outcomes truncated by death. First, defining an order of better health in this situation is not straightforward. Second, when the average causal effect is used as the population-level causal effect, a notion of distance between dead and alive individuals must be defined, in addition to an order of better health.
We show that principal stratification allows us to avoid the need for an ordering between dead and alive individuals by focusing on an "always survivor" subgroup. Furthermore, when assuming death as the worst outcome (for the sake of defining an order), we present the causal framework of other estimands, such as combined scores and the survival-incorporated median, which rely on well-selected population-level causal effects to circumvent the issue of defining a distance between dead and alive individuals.
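To illustrate why a median-type summary needs only an ordering and not a distance, the sketch below ranks death below every observed score and takes the median of the resulting composite outcome in each arm. This is one reading of the survival-incorporated median, not the authors' implementation; the data, the use of None to mark death, and the function name are hypothetical, and higher scores are assumed to mean better health.

```python
import math
import statistics


def survival_incorporated_median(scores):
    """Median of a composite outcome in which death ranks below any score.

    `scores` holds the end-of-study score for survivors and None for
    participants who died before the end of the study.
    """
    composite = [-math.inf if s is None else s for s in scores]
    # If half or more of the participants died, the median is -inf, i.e. "death".
    return statistics.median(composite)


# Hypothetical quality-of-life scores at the end of follow-up.
treated = [None, 20, 30, 40, 50]
control = [None, None, None, 35, 45]

median_treated = survival_incorporated_median(treated)
median_control = survival_incorporated_median(control)
print(median_treated, median_control)
```

An arithmetic mean of the same composite would require a finite numeric value for death, which is the distance problem described above.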
Our work also shows how these estimands target different scientific questions and highlights how the choice of estimator can help alleviate some of the assumptions necessary for estimation.
Conclusion This work offers insights into better incorporating quality of life as an endpoint in clinical trials and provides a clearer framework for utilizing powerful estimators in the challenging context of longitudinal outcomes truncated by death.
07-estimands-clinical: 4
What estimands can and should be used when interventions may be given more than once?
Joanna Anetta Hindley1, Michelle Clements1, James Carpenter1,2, Brennan Kahan1
1University College London, United Kingdom; 2London School of Hygiene and Tropical Medicine, United Kingdom
Introduction In many settings, patients may require treatment more than once. For instance, patients who experience asthma exacerbations might require medication at each onset of symptoms, and those undergoing IVF may receive several rounds of treatment until they become pregnant. Historically, most randomised trials only allowed participants to enrol for a single ‘treatment episode’, but there is growing recognition that allowing participants to re-enrol for each new episode they experience can be beneficial. However, doing so poses additional challenges around defining the estimand, as investigators need to consider issues such as how different patients experience different numbers of episodes, or the fact that patients may be assigned different treatments at different episodes. To date, there has been little work on categorising the variety of estimands that can be used in these settings, comparing the interpretability of each, or evaluating statistical estimators.
Methods/Approach We define a range of estimands that can be used in these ‘multi-episode’ settings. We discuss the interpretation of each and their clinical relevance in settings like IVF or asthma exacerbations, then evaluate the performance of different trial designs (re-randomisation and single-patient cluster randomisation) and statistical estimators for each using a simulation study. Within this simulation we allow for different data-generating models for both outcome measure and episode occurrence.
Results We define 8 estimands suitable for settings where treatments may be given more than once. Of these, 6 can be estimated under minimal assumptions so long as an appropriate design and estimator are used; however, the remaining 2 estimands require strong, untestable assumptions, and are therefore unlikely to be useful in a primary analysis. By considering the contexts of IVF and asthma exacerbations we find that the relevance of each estimand may vary depending on clinical context, highlighting the importance of careful consideration of this issue at the trial design stage.
Discussion We describe estimands suitable for use in trials evaluating treatments that may be given more than once, show that most can be estimated under minimal assumptions and explore considerations for choosing the most useful estimand. This work brings clarity to the design and analysis of clinical trials in settings that are common across medical specialties.
07-estimands-clinical: 5
Current practice around the use of estimands in cluster randomised trials, and the impact of informative cluster size on inferences
Dongquan Bi, Andrew Copas, Brennan Kahan
Institute of Clinical Trials and Methodology, University College London, United Kingdom
Introduction
The use of estimands helps clarify the treatment effect a study aims to estimate. Cluster randomised trials (CRTs) can address the participant-average (PA) or cluster-average (CA) effects. When outcomes and/or treatment effects vary across clusters depending on cluster size (also called ‘informative cluster size’ (ICS)), these two effects can differ, and estimators that target one can be biased for the other. It has been recently shown that common estimators for CRTs such as generalised estimating equations with an exchangeable correlation structure (GEE) and mixed-effects models can produce biased estimates for both PA and CA effects under ICS. However, current practice around the use of estimands in CRTs and the likely impacts of ICS remain unknown. We therefore set out to establish the current practice through a review and explore the potential impacts of ICS in a re-analysis of a CRT.
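The distinction between the two effects can be sketched with simple unadjusted means: the participant-average summary weights every participant equally, while the cluster-average summary weights every cluster equally. The data below are hypothetical and chosen so that small clusters have better outcomes and a larger treatment effect; the functions are illustrative only and are not the estimating-equation or mixed-model estimators discussed in this abstract.

```python
def participant_average(clusters):
    """Mean outcome with every participant weighted equally."""
    total = sum(sum(cluster) for cluster in clusters)
    n = sum(len(cluster) for cluster in clusters)
    return total / n


def cluster_average(clusters):
    """Mean of cluster means, with every cluster weighted equally."""
    cluster_means = [sum(cluster) / len(cluster) for cluster in clusters]
    return sum(cluster_means) / len(cluster_means)


# Hypothetical binary outcomes; each inner list is one cluster.
treated = [[1, 1], [1, 1, 0, 0, 0, 0, 0, 0]]
control = [[1, 0], [1, 1, 0, 0, 0, 0, 0, 0]]

pa_effect = participant_average(treated) - participant_average(control)
ca_effect = cluster_average(treated) - cluster_average(control)
print(pa_effect, ca_effect)
```

In this example the treatment effect exists only in the small clusters, so the cluster-average difference (0.25) is larger than the participant-average difference (0.10).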
Methods
We conducted a review of recently published CRTs to explore which estimands are targeted, which estimators are used, and how often potential impacts of ICS are considered. We then reanalysed the RESTORE trial, which randomised 31 US paediatric intensive care units to compare protocolised sedation with usual care for critically ill children. For each outcome, we compared the PA and CA effect estimates from independence estimating equations (IEE), which are robust to ICS, to evaluate the likelihood of ICS. Next, we compared estimates from GEE and mixed-effects models against IEE estimates to evaluate the extent to which results from GEE and mixed-effects models may have been affected by ICS. We used bootstrapping to evaluate how much of the difference could be due to chance.
Results
No trial (0/73) tried to report estimands. The research question was not clear in most trials (58/73). For many trials (46/73), it was not inferable whether they were targeting a PA or CA estimand. Trials often used GEE or mixed models as the primary estimator (37/73). The potential impacts of ICS were rarely considered.
The re-analysis found that the PA and CA estimates differed for most outcomes (18/22), indicating a possible presence of ICS in the RESTORE trial. For instance, for the iatrogenic withdrawal outcome, the PA OR estimate was 1.35 (95% CI, 0.72 to 2.51), and the CA OR estimate was 0.81 (95% CI, 0.39 to 1.69).
Potential impact
Trialists should describe estimands in CRTs. Doing so helps to ensure that the research question being investigated can be understood and that appropriate statistical methods aligned with that question are used.