Conference Agenda

Session

Generative AI in clinical research and drug development (BBS invited session)

Time: Wednesday, 27/Aug/2025, 4:00pm - 5:30pm

Session Chair: Giusi Moffa

Location: Biozentrum U1.111

Biozentrum, 302 seats

Session Abstract

The two talks will be followed by a panel discussion.

Presentations

inv-generative-ai-clinical-dd: 1

(Generative) AI in medicine: Where does statistics come into play?

Sarah Friedrich

University of Augsburg, Germany

Artificial intelligence, including generative models, is transforming medicine through applications in diagnostics, treatment planning, and research. However, the success of these technologies relies not only on advances in machine learning but also on robust statistical foundations. From study design and data quality assessment to the differentiation between correlation and causation, statistics ensures the reliability and interpretability of AI-driven decisions in clinical practice. One key aspect is the evaluation and comparison of AI methods – and the data used for this. While supervised learning typically relies on benchmarking datasets, statistical approaches usually focus on simulation studies. Each approach has its strengths and limitations: benchmarking ensures comparability but may lack generalisability, whereas simulations allow controlled experimentation but may not fully capture real-world complexity. A related aspect is the growing field of synthetic data. It comes with the promise of preserving privacy and expanding the size of training samples in scenarios where obtaining real data is costly, challenging, or infeasible. However, ensuring that it faithfully represents real-world distributions is not straightforward.

In this talk, we will discuss the role statistics plays in AI, focusing in particular on the data used to train and compare models. We will review the advantages and disadvantages of benchmarking vs. simulation studies and touch upon the properties, promises and challenges of synthetic data.

inv-generative-ai-clinical-dd: 2

Do Single-Cell Transformers Understand Gene Regulation? A Cautionary Benchmark

Simon Anders¹, Constantin Ahlmann-Eltze¹, Wolfgang Huber²

¹University of Heidelberg, Germany; ²EMBL Heidelberg, Germany

Deep networks are used with great success in some areas of cell biology, such as image-analysis tasks, but progress in functional genomics remains mixed at best. Yet several recent studies have claimed that transformer models, pre-trained on millions of single-cell transcriptomes, can predict the transcriptional outcome of CRISPR Perturb-Seq screens. If true, this would not only allow wet-lab experiments to be replaced with in-silico reasoning on a large scale, but also imply that these models have gained an actual understanding of the dynamics of gene regulation.

We have benchmarked these recently published pre-trained models against a deliberately naive linear model that by design cannot capture gene-gene interactions. We found that the deep networks failed to outperform this linear baseline. Therefore, claims that deep models can predict unseen experiments must be considered premature.

I will present our study and discuss possible reasons why the models fell short: for example, one key factor might be that transcriptomes were pooled across experiments and provided without metadata. This raises the question of what network architecture we would need to allow feeding a model information about the hierarchical structure of the data and about experimental covariates. I will also compare these with models that work with DNA sequence data, as there, much progress has recently been achieved in predicting, with cell-type and -state specificity, epigenetic and chromatin features and their effect on gene expression. I will discuss how this highlights inherent differences between omics data and text, which prohibit a direct transfer of the transformer architecture from the (tremendously successful) large language models to functional genomics tasks, and how progress will need to address these differences in data type through innovative model architectures. Finally, we also need consensus on robust benchmark schemes, possibly with community challenges, in order to distinguish real progress from overfitting.
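
As an illustration of what a naive baseline without gene-gene interactions could look like, the following is a minimal sketch under stated assumptions; it is not necessarily the model used in the study. It assumes an additive form in which the predicted expression after perturbing two genes is the control profile plus the two single-perturbation effects. The function names and the toy expression values are hypothetical.

```python
from math import sqrt


def additive_prediction(control, single_a, single_b):
    """Predict a double perturbation as control + effect(A) + effect(B).

    There is no interaction term, so by construction this baseline
    cannot capture gene-gene interactions.
    """
    return [c + (a - c) + (b - c) for c, a, b in zip(control, single_a, single_b)]


def rmse(predicted, observed):
    """Root-mean-square error between two equally long expression vectors."""
    if len(predicted) != len(observed):
        raise ValueError("vectors must have the same length")
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return sqrt(squared / len(predicted))


# Hypothetical toy data: log-expression of four genes (made up for illustration).
control = [1.0, 2.0, 3.0, 4.0]
single_a = [1.5, 2.0, 3.0, 4.0]     # perturbing A raises gene 1 by 0.5
single_b = [1.0, 2.0, 2.0, 4.0]     # perturbing B lowers gene 3 by 1.0
observed_ab = [1.5, 2.0, 2.0, 5.0]  # gene 4 changes only under the double perturbation

baseline = additive_prediction(control, single_a, single_b)
baseline_error = rmse(baseline, observed_ab)
```

In a benchmark of this kind, a deep model's prediction for the same held-out double perturbation would be scored with the same error measure; a model that has learned gene regulation would be expected to beat the baseline on genes such as gene 4 here, where the effect is not additive.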

inv-generative-ai-clinical-dd: 3

Panel Discussion

Giusi Moffa

Freelance Statistician

Panelists: Simon Anders, Lilla Di Scala, Sarah Friedrich, Holger Fröhlich (virtual), Kostas Sechidis, Mark M.A. van de Wiel, Maarten van Smeden (virtual)