Statistical Week 2025
2-5 September 2025
Wiesbaden, Germany
Conference Agenda
CSDS2: Computational Statistics and Data Science 2
Presentations
9:00am - 9:25am
Shrinkage Bayesian Causal Forest with Instrumental Variable
1 Universität Duisburg-Essen, Germany; 2 Ruhr Graduate School in Economics, Essen, Germany
This paper proposes a novel framework for estimating heterogeneous treatment effects using Instrumental Variables (IV) in observational studies with sparse data and imperfect compliance. To address these challenges, we build upon the Bayesian Instrumental Variable Causal Forest (BCF-IV) framework, which was developed to estimate the conditional Complier Average Causal Effect (CACE) non-parametrically while retaining interpretability. BCF-IV uses Bayesian Additive Regression Trees (BART) to identify treatment effect heterogeneity and to estimate the conditional CACE based on the conditional Intention-To-Treat (ITT) effects and the proportion of compliers. Our approach extends BCF-IV by proposing a Shrinkage Bayesian Instrumental Variable Causal Forest (SBCF-IV) algorithm. SBCF-IV adopts the SoftBART algorithm and makes two major contributions. First, SBCF-IV implicitly discriminates between relevant and irrelevant covariates when estimating conditional ITT effects and proportions of compliers. Second, our approach carries the varying posterior splitting probabilities from SoftBART over into the discovery of heterogeneous subgroups. These modifications enhance SBCF-IV's ability to handle sparse data and to detect variables that drive the heterogeneity of treatment effects. A simulation study suggests that a more precise estimation of the conditional CACE can be achieved while maintaining interpretability, particularly in scenarios with sparsity, confounding, and nonlinearity. In an empirical application, we revisit the Oregon Health Insurance Experiment to demonstrate the use of SBCF-IV in comparison to BCF-IV and discuss the differences in the estimates for the conditional CACE.

9:25am - 9:50am
Bayesian Causal Forests for Cost-Effectiveness Analysis
1 Universität Duisburg-Essen, Germany; 2 Universität zu Köln, Germany; 3 Ruhr Graduate School in Economics, Essen, Germany
We introduce a novel approach that combines Bayesian Causal Forests (BCF) with Cost-Effectiveness Analysis (CEA) to assess effect heterogeneity of a binary treatment under unit-varying costs. Recently, CEA Forests have been proposed to estimate heterogeneous effects in a frequentist setting using Generalized Random Forests (GRF). This approach requires, first, estimating the uplift of the cost and the outcome effect, and, second, differencing the unit-level effects. We translate CEA Forests into a Bayesian framework, which holds promise for improving accuracy, facilitating seamless uncertainty quantification, and more effectively capturing sparsity within the underlying data generating process. A simulation study illustrates guidelines for visual and metric-based CEA using our approach. We find that our method outperforms the GRF-based CEA Forests, especially under smoothness in the covariate signal and in small samples.

9:50am - 10:15am
Efficient nonparametric estimation of Markov-switching models
Universität Bielefeld, Germany
Markov-switching models are powerful tools for capturing complex patterns in time series data driven by latent states. Recent work has highlighted the benefits of estimating components of these models nonparametrically, enhancing their flexibility and reducing biases, which in turn can improve state decoding, forecasting, and overall inference. Formulating such models using penalised splines is straightforward, but practically feasible methods for data-driven smoothness selection in these models are still lacking. Traditional techniques, such as cross-validation and information criteria-based selection, suffer from major drawbacks, most importantly their reliance on computationally expensive grid search methods, which hampers practical usability for Markov-switching models. An alternative approach treats spline coefficients as random effects and employs marginal likelihood maximisation via the TMB R package, avoiding grid search but introducing a computationally demanding nested optimisation problem and potential numerical instability. Instead, we propose using the so-called extended Fellner-Schall method for smoothness selection, which leverages the relatively simple structure of penalised splines treated as random effects. This method provides an efficient and general mechanism for smoothness selection, avoiding the need for nested optimisation and higher-order derivatives, improving numerical stability, and significantly reducing computational costs. Our approach enables the practical estimation of flexible Markov-switching models, even in complex settings.

10:15am - 10:40am
Time series forecasting in SAP using a data-driven semiparametric ARMA model
1 Universität Paderborn, Germany; 2 MHP Management und IT Beratung GmbH
Motivated by the growing use of semi- and nonparametric models in time series forecasting and their demonstrated superior performance in many empirical studies, this paper explores the adoption and integration of a semiparametric ARMA model in an enterprise system landscape. We begin by reviewing the basic construction of the semiparametric ARMA model, the iterative plug-in algorithm for estimating the trend component of trend-stationary time series, and the forecasting techniques and quality measures, which have been well researched and published with the R package smoots. Subsequently, we showcase a novel approach to adopting the semiparametric ARMA model in a forecast application based on SAP Analytics Cloud (SAC), which leverages the platform's strengths in system integrity, state-of-the-art user interface (UI) design, and seamless connection to an R engine with the smoots package embedded. The forecast application addresses key challenges in terms of cost efficiency, user experience, and the requirement for in-house statistical or machine learning expertise when adopting such statistical algorithms in an enterprise context. Finally, we empirically evaluate the forecast quality of the integrated semiparametric ARMA model using real-world data, demonstrating promising results overall.
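
As a rough illustration of the semiparametric ARMA idea in the abstract above (a smooth deterministic trend plus a stationary ARMA error), the following Python sketch estimates the trend by local linear regression and fits an AR(1) model to the detrended series. It is not the smoots implementation. The fixed bandwidth stands in for the iterative plug-in algorithm, AR(1) stands in for a general ARMA model, and all function names (`simulate`, `local_linear_trend`, `fit_ar1`, `forecast`) as well as the simulated data are hypothetical.

```python
import random


def simulate(n=300, phi=0.5, sigma=0.2, seed=0):
    """Hypothetical test data: quadratic trend plus AR(1) noise."""
    rng = random.Random(seed)
    noise = [0.0] * n
    for t in range(1, n):
        noise[t] = phi * noise[t - 1] + rng.gauss(0.0, sigma)
    return [2.0 + 3.0 * ((t + 0.5) / n) ** 2 + noise[t] for t in range(n)]


def epanechnikov(u):
    """Epanechnikov kernel weight on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def local_linear_trend(y, bandwidth=0.15):
    """Local linear estimate of the trend and its slope on rescaled time.

    Assumption: the bandwidth is fixed by hand; the abstract instead
    selects it with an iterative plug-in algorithm.
    """
    n = len(y)
    x = [(t + 0.5) / n for t in range(n)]  # rescaled time in (0, 1)
    trend, slope = [], []
    for x0 in x:
        s0 = s1 = s2 = t0 = t1 = 0.0
        for xt, yt in zip(x, y):
            d = xt - x0
            w = epanechnikov(d / bandwidth)
            s0 += w
            s1 += w * d
            s2 += w * d * d
            t0 += w * yt
            t1 += w * d * yt
        # Solve the 2x2 weighted least-squares normal equations in closed form.
        det = s0 * s2 - s1 * s1
        trend.append((s2 * t0 - s1 * t1) / det)
        slope.append((s0 * t1 - s1 * t0) / det)
    return trend, slope


def fit_ar1(resid):
    """Least-squares AR(1) coefficient of the detrended series."""
    num = sum(a * b for a, b in zip(resid[1:], resid[:-1]))
    den = sum(b * b for b in resid[:-1])
    return num / den


def forecast(y, horizon=5, bandwidth=0.15):
    """Forecast = linearly extrapolated trend + AR(1) forecast of the residual."""
    n = len(y)
    trend, slope = local_linear_trend(y, bandwidth)
    resid = [yt - mt for yt, mt in zip(y, trend)]
    phi = fit_ar1(resid)
    # One time step corresponds to 1/n on the rescaled time axis.
    return [
        trend[-1] + slope[-1] * h / n + resid[-1] * phi ** h
        for h in range(1, horizon + 1)
    ]
```

Calling `forecast(simulate(), horizon=5)` returns five point forecasts. Splitting the trend and the ARMA part keeps each step simple to inspect, which is one reason such a model can be embedded in a reporting front end with little extra tuning.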
