Conference Agenda

Session: Meta science 2
Time: Tuesday, 26 Aug 2025, 11:30am - 1:00pm
Location: ETH E23 (D-BSSE, ETH, 84 seats)

Presentations
29-1 Meta science 2: 1

Developing communication skills in biostatistical consulting and collaboration

Karen Lamb, Sabine Braat, Julie Simpson

University of Melbourne, Australia

Communication is a key skill in biostatistical consultancy, and yet it is not emphasised in our training. One of the most challenging things about embarking on a career in statistical consulting is learning how to say “no” or, more often, “not like that” or “not right now”. Although it is important to say “yes” to many projects for strategic or financial purposes, to develop new expertise, or out of interest in the topic, there are also many reasons and ways to say no. Of particular concern is feeling you have to say “yes” to something that is clearly fraudulent or unethical. In a 2018 US study of 390 consulting biostatisticians, researchers were found to often make “inappropriate requests” of statisticians, ranging from being asked to fake statistical significance to being asked to change or remove data [1]. Clearly, it is important that biostatisticians feel able to say no, but this can be difficult, particularly for early career biostatisticians who feel under pressure to acquiesce to the requests of their seniors.

The Methods and Implementation Support for Clinical and Health research (MISCH) Hub provides support in key aspects of clinical and health research, including biostatistics, health economics and co-design, to researchers at the University of Melbourne and affiliated hospital partners. As co-Head of Biostatistics for MISCH, one aspect of my role is to help biostatisticians develop confidence in communicating effectively with researchers, including the ability to negotiate with collaborators. In this presentation, I will describe situations in which a “no”, “not like that” or “not right now” has been the approach I have opted for, and how I negotiated alternative solutions for the collaborator. The case study examples include grant applications with inappropriate study designs, convincing researchers to move on from ANOVA, incorporating the estimand framework in trials, and negotiating achievable deadlines with collaborators. In addition, I will describe some approaches we use within MISCH to support the development of communication skills in our team, to help them in their roles as biostatistical consultants within academia.

REFERENCES

[1] Wang et al. Researcher Requests for Inappropriate Analysis and Reporting: A U.S. Survey of Consulting Biostatisticians. Ann Intern Med 2018;169(8):554-558.



29-1 Meta science 2: 2

Are we “Essential” or “Not Needed”? Varying perceptions of statisticians' value: a national study of human research ethics committees

Adrian Barnett1, Nicole White1, Taya A Collyer2

1Australian Centre for Health Services Innovation and Centre for Healthcare Transformation, Queensland University of Technology; 2National Centre for Healthy Ageing, Monash University, Australia

Background: Inappropriate study design and statistical analysis lead to research waste and wasted participant effort. Proposed clinical research is generally reviewed by an ethics committee, and ethical review represents a key opportunity for inappropriate designs to be identified and remedied. However, the availability and quality of statistical advice for ethics committees are inconsistent, due to a lack of standardisation. In Australia, the proportion of committees with access to a formally qualified statistician is unknown, and the beliefs and attitudes regarding the role of statisticians in ethical review are unclear.
Methods: To explore these issues, we approached all human research ethics committees in Australia to complete an online survey with open and closed questions about the role of statistical advice in the committee's work. Structured questions were analysed descriptively, and text responses to open questions were analysed qualitatively via thematic analysis with both inductive and deductive code-sets.
Results: Sixty percent of committees reported access to a statistician, either as a full committee member or as a non-member who could be consulted, but this reduced to 35% when accounting for formal statistical qualifications. Many committees reported relying on “experienced” or “highly numerate” researchers in place of qualified statisticians, apparently viewing general research experience and advanced statistical training as equivalent.

Committees without access to statisticians tended to locate responsibility for study design with other parties, including researchers, trial sponsors, and institutions. Some committee chairs viewed formal statistical input as essential to the work of their committee; however, amongst those who viewed statistical advice as unimportant or unnecessary, there was a widespread belief that statistical review is only applicable to particular kinds of studies, and that “simple”, observational or “small” studies do not merit statistical review.
Conclusion: We encountered dramatic and surprising variation in practice and attitudes towards the role of statisticians on human research ethics committees. Concerningly, qualitative analysis revealed that some practices and attitudes are underpinned by beliefs about statistics and statisticians which are demonstrably incorrect. The number of research studies receiving approval without statistical review in Australia is concerning: at best, such studies waste resources; at worst, they cause harm through flawed evidence.



29-1 Meta science 2: 3

An Empirical Assessment of the Cost of Dichotomization

Erik van Zwet1, Frank Harrell2, Stephen Senn3

1Leiden University Medical Center, The Netherlands; 2Vanderbilt University Medical Center, Tennessee, US; 3Edinburgh, UK

Background We consider two-arm parallel clinical trials. It is well known that binary outcomes are less informative than continuous (numerical) ones. This must be compensated for by larger sample sizes to maintain sufficient power.

Methods If the continuous outcome has a normal distribution, then the standardized mean difference (SMD) and the probit transformation of the dichotomized outcome are both estimates of Cohen’s d. We use this equivalence to study the loss of information due to dichotomization. We used 21,435 unique randomized controlled trials (RCTs) from the Cochrane Database of Systematic Reviews (CDSR). Of these trials, 7,224 (34%) have a continuous (numerical) outcome and 14,211 (66%) have a binary outcome. We find that trials with a binary outcome have larger sample sizes on average, but also larger standard errors and fewer statistically significant results.
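To make the equivalence concrete, here is a minimal simulation sketch (not the authors' code; the effect size, cutpoint and sample size are illustrative choices): both the SMD of the continuous outcome and the difference in probit-transformed response proportions estimate Cohen’s d, but the dichotomized version does so with a larger standard error.

```r
## Minimal simulation sketch: SMD and probit difference both estimate d,
## but dichotomization inflates the standard error. All settings illustrative.
set.seed(1)

n    <- 100    # per arm
d    <- 0.5    # true Cohen's d
cut  <- 0      # dichotomization threshold (here: the control-arm mean)
nsim <- 5000

est <- replicate(nsim, {
  x0 <- rnorm(n, mean = 0, sd = 1)   # control arm
  x1 <- rnorm(n, mean = d, sd = 1)   # treatment arm

  # SMD from the continuous outcome
  sd_pooled <- sqrt(((n - 1) * var(x0) + (n - 1) * var(x1)) / (2 * n - 2))
  smd <- (mean(x1) - mean(x0)) / sd_pooled

  # probit difference from the dichotomized outcome ("responders": x > cut)
  p0 <- mean(x0 > cut)
  p1 <- mean(x1 > cut)
  probit_diff <- qnorm(p1) - qnorm(p0)

  c(smd = smd, probit = probit_diff)
})

rowMeans(est)      # both estimates are close to the true d = 0.5
apply(est, 1, sd)  # the probit-based estimate has the larger standard error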

Results We find that researchers do tend to increase the sample size to compensate for the lower information content of binary outcomes, but not nearly sufficiently.

Conclusion In many cases, the binary outcome is the result of dichotomizing a continuous outcome, which is sometimes referred to as “responder analysis”. In those cases, the loss of information is avoidable. Burdening more subjects than necessary is wasteful, costly and unethical. We provide a method to calculate by how much the sample size could be reduced if the outcome were not dichotomized. We hope that this will guide researchers during the planning phase. We also provide a method to calculate the loss of information after a responder analysis has been done. We hope that this will motivate researchers to abandon dichotomization in future trials.
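As a back-of-envelope illustration of the stakes (not the authors' method), base R's power calculators show the sample-size penalty for dichotomizing a normal outcome at the control-arm mean when the true effect is d = 0.5:

```r
## Hedged illustration using stats::power.t.test and stats::power.prop.test;
## the effect size and cutpoint are arbitrary choices, not CDSR values.
d  <- 0.5
p0 <- pnorm(0)   # control response rate when dichotomizing at the control mean
p1 <- pnorm(d)   # treatment response rate under a shift of d standard deviations

# per-arm sample size for 80% power at a two-sided 5% level
power.t.test(delta = d, sd = 1, sig.level = 0.05, power = 0.80)$n     # roughly 64 per arm
power.prop.test(p1 = p0, p2 = p1, sig.level = 0.05, power = 0.80)$n   # roughly 100 per arm
```

In this illustrative scenario the dichotomized design needs roughly 60% more participants per arm for the same power.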



29-1 Meta science 2: 4

Reporting Completeness in Conventional and Machine Learning-Based COVID-19 Prognostic Models: A Meta-Epidemiological Study

Ioannis Partheniadis, Persefoni Talimtzi, Adriani Nikolakopoulou, Anna Haidich

Aristotle University of Thessaloniki, Greece

Background: The rapid publication of prognostic prediction models for COVID-19 presents an opportunity to evaluate their reporting completeness and, in turn, their potential for clinical application. This study assesses and compares the reporting completeness of conventional and machine learning-based models using the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement [1] and its AI extension (TRIPOD+AI) [2].

Methods: We included studies reporting the development, internal validation, and external validation of prognostic prediction models for COVID-19 using either conventional or machine learning-based algorithms. Literature searches were conducted in MEDLINE, Epistemonikos.org, and Scopus (up to July 31, 2024). Studies using conventional statistical methods were evaluated against TRIPOD, while machine learning-based studies were assessed against TRIPOD+AI. Data extraction followed the TRIPOD and TRIPOD+AI checklists, measuring adherence per article and per checklist item. The protocol was registered on the Open Science Framework (https://osf.io/kg9yw).
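This kind of per-article and per-item adherence calculation can be sketched as follows (a minimal sketch assuming a hypothetical extraction file, not the authors' data):

```r
## Hypothetical extraction layout: one row per article, one column per
## checklist item, coded 1 = reported, 0 = not reported, NA = not applicable.
adherence <- read.csv("tripod_extraction.csv", row.names = 1)

# per-article adherence: percentage of applicable checklist items reported
per_article <- rowMeans(adherence, na.rm = TRUE) * 100

# per-item adherence: percentage of articles reporting each checklist item
per_item <- colMeans(adherence, na.rm = TRUE) * 100

summary(per_article)   # distribution of adherence across studies
sort(per_item)[1:5]    # the five least-reported checklist items
```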

Results: We identified 53 studies reporting 71 prognostic models. On average, studies using conventional models adhered to 38.1% (SD: ±10.4) of the TRIPOD checklist, while machine learning-based studies adhered to 28.4% (SD: ±8.9) of TRIPOD+AI. No study fully adhered to abstract reporting requirements, and few included an appropriate title (29.0%, 95% CI: 16.1–46.6 for TRIPOD; 13.6%, 95% CI: 4.8–33.3 for TRIPOD+AI). Notably, no study fully reported a sample size assessment. Reporting of methods and results sections was poor across both frameworks. Overall, adherence to TRIPOD and TRIPOD+AI guidelines was generally low, with machine learning-based models showing significantly lower overall adherence (28.4% vs. 38.1%; p < 0.001). The lower adherence to the TRIPOD+AI statement was somewhat expected, as these guidelines were published in April 2024 [2], two years after the most recent study included in this analysis.

Conclusion: Reporting completeness was inadequate for both conventional and machine learning-based models, with critical omissions in model specifications and performance metrics. Strengthening adherence to reporting guidelines is essential to enhance research transparency, prevent research waste, and improve clinical utility.

[1] Collins GS, Reitsma JB, Altman DG, et al. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD Statement. BMC Med 13, 1 (2015). doi:10.1186/s12916-014-0241-z.

[2] Collins GS, Moons KGM, Dhiman P, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ 385, e078378 (2024). doi:10.1136/bmj-2023-078378.



29-1 Meta science 2: 5

How to improve data quality reporting: a real-world data example using registry data

Elena Salogni1, Thomas J. Musholt2, Elisa Kasbohm1, Stephan Struckmann1, Carsten Oliver Schmidt1

1Institute for Community Medicine, University Medicine Greifswald, Greifswald, Germany; 2Section of Endocrine Surgery, Department of General, Visceral and Transplantation Surgery, University Medical Centre, Johannes Gutenberg-University Mainz, Mainz, Germany

Background / Introduction

Knowing the properties and quality of research data is essential for credible research findings. However, achieving a comprehensive overview can be difficult if the data have been obtained outside a highly controlled environment. This is often the case with real-world data collections, such as clinical registries. Ideally, registry administrators would provide extensive data overviews in a standardized manner, but this is not routinely the case. This talk presents an information workflow to transparently check data with the R package dataquieR [1]. We illustrate how this approach provides clear and actionable insights into data quality issues using Eurocrine, a European registry containing data on the diagnosis and surgical treatment of endocrine tumors and diseases.

Methods

The Eurocrine registry contains data from more than 170,000 patients and almost 150 clinics across 17 European countries. The data cover the preoperative, operative, postoperative, and follow-up phases. Metadata defining expectations for the data were assembled in an Excel worksheet. With a single command call, dataquieR produces data quality reports comprising descriptive statistics and up to 24 data quality indicators related to data formatting, missing data, range violations, outliers, contradictions, cluster effects, and time trends. All findings are assembled in an extensive html report and can be used for subsequent data corrections.
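The workflow can be sketched roughly as follows (a minimal, hypothetical example: the file names and metadata layout are assumptions, and dq_report2() is used here as the single-command report generator described in the dataquieR 2 paper, not as the exact Eurocrine setup):

```r
## Hedged sketch of the single-command reporting workflow; input files are
## hypothetical and the metadata sheet must follow dataquieR's conventions.
library(dataquieR)

study_data <- read.csv("eurocrine_export.csv")               # registry export (hypothetical)
meta_data  <- readxl::read_excel("eurocrine_metadata.xlsx")  # per-variable expectations

# single command computing descriptive statistics and the data quality indicators
report <- dq_report2(study_data = study_data, meta_data = meta_data)

# the report object can then be rendered to the extensive html report described above
report
```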

Results

Due to the numerous calculations, the computational time (Windows 10, 128 GB RAM, Core i7 with 12 cores) was approximately 10-12 hours. The report reveals issues related to data formatting errors, completeness, and data correctness. All mandatory variables are almost 100% complete. A few inadmissible values (<2%) and implausible values (e.g., postoperative serum calcium levels) can be observed. Contradiction checks show violation rates below 2%.

Conclusion

The presented approach is generic and can be applied similarly to other data sources. In the Eurocrine application example, previously unknown issues were discovered despite existing measures to secure high data quality. The findings are relevant for decisions on statistical analyses, but also as part of ongoing data monitoring to improve, where possible, the quality of the database.

[1] Struckmann S, et al. (2024). dataquieR 2: An updated R package for FAIR data quality assessments in observational studies and electronic health record data. JOSS 9(98):6581. doi:10.21105/joss.06581.