Conference Agenda

Session Overview
8B - R in Production 2
Friday, 09/July/2021:
1:45pm - 3:15pm

Session Chair: Zhian N. Kamvar
Session Chair: Matt Bannert
Zoom Host: Pamela Pairo
Replacement Zoom Host: Yuya Matsumura
Virtual location: The Lounge #talk_r_production_2
Session Topics:
R in production

Session Sponsor: Appsilon

1:45pm - 2:05pm
ID: 223 / ses-08-B: 1
Regular Talk
Topics: R in production
Keywords: deployment

Shiny PoC to Production Application in 8 steps

Marcin Dubel

"A great advantage of Shiny applications is that a proof of concept can be created quickly and easily. It is a great way for subject matter experts to present their ideas to stakeholders before moving on to production. However, taking the next step to a production application requires help from experienced software developers. Their work should focus on two areas: making the application a great experience for users and making it maintainable for future work. Focusing on these helps ensure that the app will be scalable, performant, bug-free, extendable, and enjoyable. Close collaboration between engineers and experts paves the way to many successful projects in data science and is Appsilon’s proven path to production-ready solutions.

The very first step should always be to build a comfortable and, importantly, reproducible workflow by setting up the development environment and organizing the folder structure [renv + docker]. Once this is done, engineers should trim the codebase by cleaning the code, i.e., removing redundant comments and extracting constants and inline styles [ymls + styler]. Now the fun begins: extract the business logic into separate functions, modules and classes [packages/R6 + plumber]. Restrict reactivity to a minimum. Check the logic [data.validator + drake]. Add tests [testthat + cypress/shinytest]. Organize your /www and move actions to the browser [shiny + css/js]. Finally, style the app [sass/bslib + shiny.fluent]. And, voilà! A world-class Shiny app."

2:05pm - 2:25pm
ID: 251 / ses-08-B: 2
Regular Talk
Topics: R in production
Keywords: packages, reproducibility, projects, production

Reliably Reproducible Project Packages

Alex Kahn Gold

RStudio, United States of America

We all dread sharing a data science project with a collaborator, or returning to a project, only to find that it doesn't run because of mismatched package versions. Maintaining and sharing R projects has historically been a fragile endeavor, relying mainly on crossed fingers.

There are now simple workflows for creating and sharing an isolated R package environment for any project, making luck irrelevant to the process.

In this talk, you'll learn to use the {renv} package to quickly and easily create isolated project environments, capture the packages in those environments, and share them with collaborators. Additionally, you'll learn how to take advantage of dated repository URLs from the public RStudio Package Manager so that you can add more packages and continue work on your project, no matter how far down the road that is.

2:25pm - 2:45pm
ID: 224 / ses-08-B: 3
Regular Talk
Topics: R in production
Keywords: deployment, DevOps, infrastructure, integration, R packages

Binary R Packages for Linux: Past, Present and Future

Iñaki Ucar (1), Dirk Eddelbuettel (2)

(1) Universidad Carlos III de Madrid; (2) University of Illinois at Urbana-Champaign

Pre-compiled binary packages provide a very convenient way of efficiently distributing software, one that has been adopted by most Linux package management systems. However, the heterogeneity of the Linux ecosystem, combined with the growing number of R extensions available, poses a scalability problem. As a result, efforts to bring binary R packages to Linux have been scattered and lack a proper mechanism to fully integrate them with R’s package manager. This work reviews the past and present of binary distribution for Linux, and presents a path forward by showcasing the ‘cran2copr’ project, an RPM-based proof-of-concept implementation of an automated, scalable binary distribution system capable of building, maintaining and distributing thousands of packages, while providing a portable and extensible bridge to the system package manager. This benefits not only desktop and server users of Linux systems, but also Windows and macOS users who rely on CI/CD systems to test packages and/or deploy code.

2:45pm - 3:05pm
ID: 219 / ses-08-B: 4
Regular Talk
Topics: Community and Outreach
Keywords: community

R for Non-Programmers: Creating Paradigm Shifts in Reporting for Community-Facing Organizations Using R

Lisa Kulka, Sulagna Patra, Mohammad Haque

CCNY Inc.

The automation of reporting processes for community-based organizations working with diverse communities has created a paradigm shift in the way they can approach quality improvement, allocate time and resources to data analyses and management, and utilize various kinds of data to support the communities with which they work.

At CCNY, a nonprofit organization that supports the evaluation and analytics work of community-facing organizations, the use of R to generate and enhance reporting schema has made a significant impact for those who have little to no programming experience. Among the most successful projects that increased organizational effectiveness through the use of R with non-programmers is CCNY’s support of its local county’s Children’s System of Care. Non-automated data reporting and management created challenges in tracking credentials and ensuring that children and families were receiving appropriate services. CCNY deployed R to pre-process and automate training data, generating reports that predict which community providers are eligible to render services. This new organizational ability streamlined reporting processes and led non-programmers to run R and apply the newly established data framework to automate other data-related tasks. The results include dramatically increased community impact through reduced data processing time, faster decisions informed by real-time data, and greater data processing capabilities that improve overall organizational capacity.

In this session, we will review specific features of the project’s unique code used to establish streamlined automated reporting for those with limited R proficiency, as well as the project input, the output deliverables, and techniques for engaging non-programmers in the logistics of building R schema so that basic principles of automated reporting can be understood and easily generalized within and outside of community-facing organizations.

Link to package or code repository: Code will be shared in selected pieces because parts of it contain sensitive information. Thank you for your understanding!