9C - Statistical modeling and data analysis 2
Social sciences, Bayesian models, Statistical models, Time series
Session Sponsor: R Consortium
8:00pm - 8:20pm
ID: 145 / ses-09-C: 1
Topics: Time series
Keywords: Bayesian analysis, Markov chain Monte Carlo, state space models, R package, sequential Monte Carlo
bssm: Bayesian Inference of Non-linear and Non-Gaussian State Space Models in R
University of Jyväskylä
State space models are a flexible class of latent variable models commonly used in analysing time series data. The R package bssm is designed for Bayesian inference of general state space models with non-Gaussian and/or non-linear observational and state equations. The package provides easy-to-use and efficient functions for fully Bayesian inference with common time series models such as the basic structural time series model with exogenous covariates, simple stochastic volatility models, and discretized diffusion models, making it straightforward to make predictions and other inferences in a Bayesian setting. Unlike existing packages, bssm also allows easy-to-use approximate inference based on Gaussian approximations such as the Laplace approximation and the extended Kalman filter. Inference is based on fully automatic, adaptive Markov chain Monte Carlo (MCMC) on the hyperparameters, with optional parallelizable importance sampling post-correction to eliminate any approximation bias. The bssm package also implements a direct pseudo-marginal MCMC and a delayed acceptance pseudo-marginal MCMC using intermediate approximations. The package directly supports models with linear-Gaussian state dynamics and non-Gaussian observation models, and has an Rcpp interface for specifying custom non-linear and diffusion models.
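As a rough illustration of the idea of running MCMC on the hyperparameters of a state space model, the following Python sketch evaluates the marginal likelihood of a linear-Gaussian local level model with a Kalman filter and samples the two standard deviations with a plain random-walk Metropolis step. It is not the bssm API and does not reproduce its adaptive or pseudo-marginal samplers; the function names, the priors, and the fixed step size are all assumptions made for the example.

```python
import math
import random


def local_level_loglik(y, sigma_obs, sigma_state, a1=0.0, p1=1e4):
    """Kalman-filter log-likelihood of a local level model.

    y_t      = mu_t + eps_t,  eps_t ~ N(0, sigma_obs^2)
    mu_{t+1} = mu_t + eta_t,  eta_t ~ N(0, sigma_state^2)
    a1 and p1 are the prior mean and variance of the first state.
    """
    a, p = a1, p1
    loglik = 0.0
    for obs in y:
        f = p + sigma_obs ** 2            # variance of the one-step prediction
        v = obs - a                       # one-step prediction error
        loglik += -0.5 * (math.log(2.0 * math.pi * f) + v * v / f)
        k = p / f                         # Kalman gain
        a = a + k * v                     # predicted mean of the next state
        p = p * (1.0 - k) + sigma_state ** 2
    return loglik


def rw_metropolis(y, n_iter=2000, step=0.3, seed=1):
    """Random-walk Metropolis on (log sigma_obs, log sigma_state).

    Assumption for illustration: independent N(0, 2^2) priors on the
    log standard deviations and a fixed (non-adaptive) step size.
    """
    rng = random.Random(seed)

    def log_post(theta):
        log_prior = sum(-0.5 * (t / 2.0) ** 2 for t in theta)
        return log_prior + local_level_loglik(
            y, math.exp(theta[0]), math.exp(theta[1])
        )

    theta = [0.0, 0.0]
    current = log_post(theta)
    draws, accepted = [], 0
    for _ in range(n_iter):
        proposal = [t + step * rng.gauss(0.0, 1.0) for t in theta]
        candidate = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
            accepted += 1
        draws.append((math.exp(theta[0]), math.exp(theta[1])))
    return draws, accepted / n_iter


if __name__ == "__main__":
    series = [0.1, -0.2, 0.4, 0.3, 0.9, 1.1, 0.8, 1.4]  # made-up data
    samples, rate = rw_metropolis(series, n_iter=500)
    print("acceptance rate:", rate)
```

For non-Gaussian or non-linear models the exact Kalman likelihood is unavailable, which is where the approximations and importance sampling correction described in the abstract come in.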
8:20pm - 8:40pm
ID: 238 / ses-09-C: 2
Topics: Bayesian models
Keywords: graphical models
bnmonitor: Checking the Robustness and Sensitivity of Bayesian Networks
Baylor University; IE University, Madrid; Università di Bologna, Bologna, Italy
Bayesian networks (BNs) are among the most common approaches to investigating the relationships between random variables. There are now a variety of R packages capable of learning such models from data and performing inference. The new bnmonitor R package is the only package that enables users to perform robustness and sensitivity analysis for BNs, in both the discrete and the continuous case.
Various prequential monitors are implemented to check how well a BN describes a dataset used to learn the model. By checking the elements of the structure, we can adjust the model, presumably the best in the equivalence class, to ensure a good fit. Checking the forecasts that flow from the model allows users to check elements of the model structure in an online setting.
Furthermore, the impact of the learned probabilities is investigated using sensitivity functions, which describe the functional relationship between an output of interest and the model's parameters.
The output of these monitors is concisely reported via a tailored plot method that takes advantage of ggplot2. We illustrate our methods with an example that explores the relationships between body measurements to predict the percentage of body fat. Our example highlights the importance of checking a BN with the appropriate diagnostics.
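To make the notion of a sensitivity function concrete, here is a minimal Python sketch for a hypothetical two-node discrete network A -> B. It varies the parameter theta = P(A=1) and returns two outputs of interest: the marginal P(B=1), which is linear in theta, and the posterior P(A=1 | B=1), which is a ratio of two linear functions of theta. The network, the probability values, and the function names are assumptions for illustration and are not part of bnmonitor.

```python
def marginal_sensitivity(theta, p_b_given_a=0.9, p_b_given_not_a=0.2):
    """P(B=1) as a function of theta = P(A=1); linear in theta."""
    return theta * p_b_given_a + (1.0 - theta) * p_b_given_not_a


def posterior_sensitivity(theta, p_b_given_a=0.9, p_b_given_not_a=0.2):
    """P(A=1 | B=1) as a function of theta = P(A=1), via Bayes' rule."""
    numerator = theta * p_b_given_a
    return numerator / marginal_sensitivity(theta, p_b_given_a, p_b_given_not_a)


if __name__ == "__main__":
    # Tabulate both sensitivity functions over a grid of parameter values.
    for theta in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(theta, marginal_sensitivity(theta), posterior_sensitivity(theta))
```

A steep sensitivity function around the learned value of theta indicates that the output is fragile with respect to that parameter.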
8:40pm - 9:00pm
ID: 293 / ses-09-C: 3
Topics: Social sciences
Keywords: matching, causal, observational
FLAME: Interpretable Matching for Causal Inference
Duke University, United States of America
Matching methods are a class of techniques for estimating causal effects from observational data. Such methods match similar units together to emulate the randomization achieved by controlled experiments. Crucially, matching methods rely on a distance measure to determine similarity and thereby match units together. In this talk, we present an R package, FLAME, implementing the Fast, Large-scale Almost Matching Exactly (FLAME) and Dynamic Almost Matching Exactly (DAME) algorithms for performing matching on categorical datasets. These algorithms learn a weighted Hamming distance metric via machine learning on a held-out dataset and match units directly on covariate values, prioritizing matches on more important covariates. The R package features an efficient bit-vector implementation, allowing it to scale to datasets with hundreds of thousands of units and dozens of covariates, with a database implementation under development that allows it to operate on datasets too large to fit in memory. FLAME provides easy summarization, analysis, and visualization of treatment effect estimates, and features a wide variety of options for how matching is performed, allowing users to make analysis-specific decisions throughout the matching procedure. We present an overview of the main functionality of the package and then illustrate an application to the 2010 US NCHS Natality Dataset, in which we study the effect of smoking during pregnancy on NICU admissions.
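The role of a weighted Hamming distance can be illustrated with a short Python sketch. It is not the FLAME or DAME algorithm and not the package's interface: the weights are simply assumed to be given (rather than learned on a held-out dataset), each treated unit is paired with its nearest control, and all names and data are hypothetical.

```python
def weighted_hamming(x, z, weights):
    """Sum of the weights of the covariates on which units x and z disagree."""
    return sum(w for xi, zi, w in zip(x, z, weights) if xi != zi)


def closest_controls(treated, controls, weights):
    """For each treated unit, return the index of the nearest control unit."""
    return [
        min(
            range(len(controls)),
            key=lambda j: weighted_hamming(unit, controls[j], weights),
        )
        for unit in treated
    ]


if __name__ == "__main__":
    # Assumed importance weights: the first covariate matters most.
    weights = [5.0, 1.0, 1.0]
    treated = [("a", "x", "p")]
    controls = [
        ("b", "x", "p"),  # differs only on the important covariate
        ("a", "y", "q"),  # differs on the two less important covariates
    ]
    print(closest_controls(treated, controls, weights))
```

In this toy example the second control is preferred even though it disagrees on more covariates, because it agrees on the covariate with the largest weight.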
9:00pm - 9:20pm
ID: 363 / ses-09-C: 4
Topics: Community and Outreach
R Consortium and You: How you can help us connect the dots
This presentation will showcase what the R Consortium is and how it improves the R ecosystem. A lot is going on! We have awarded over USD 1,300,000 in funding to the R community. The Infrastructure Steering Committee (ISC) and the diverse working group (WG) programs have stimulated conversation and alignment on crucial areas such as industry adoption, package health, and educational standards. This presentation will showcase several working groups and projects funded and shepherded by the R Consortium. The discussion will help the audience understand the work being done and the exciting opportunities to participate in these initiatives.