Virtual Workshops on Blended Data a Success

Elizabeth Mannshardt and Jenny Thompson

Workshop Contributors

Aric LaBarr

Hunter Glanz

Cynthia Rudin

Frauke Kreuter

Matthew Graham

Trent Buskirk

The Government Statistics Section (GSS) and Social Statistics Section (SSS) hosted a series of virtual workshops as part of the ASA’s professional development program. This free series targeted audiences who may not be able to travel to conferences but are interested in continuing education opportunities. All webinar materials and videos are available on the GSS professional development and mentoring website.

Each virtual workshop consisted of a one-hour presentation followed by virtual participation in a group discussion and activities using data and code provided by the presenter. The six sessions had between 75 and 120 attendees each.

Because of the success of the workshops, GSS and SSS will continue the series in the fall of 2020 with a virtual workshop practicum that includes a student showcase, again hosted as part of the ASA’s professional development program. Practicum sessions will focus on user applications, both completed and in progress, from methodologists ranging from current students to seasoned professionals. Students and professionals are invited to submit projects that put blended data techniques into practice for the opportunity to present at the practicum.

The original virtual workshop series exposed participants to the advantages of using combined data sources for developing inferential models and measures, while remaining cognizant of the challenges associated with combining large data sets and the potential pitfalls of analyses of blended data, including privacy considerations. Topics covered included the following:

  • Overview of Blended Data (Frauke Kreuter, University of Maryland)
  • Intro to Big Data and ML for Survey Researchers (Trent Buskirk, Bowling Green State University)
  • How Rare Is Rare? The Importance of Validation (Aric LaBarr, North Carolina State University)
  • Intro to Python for Data Science (Hunter Glanz, California Polytechnic State University)
  • Interpretability vs. Explainability in ML for High Stakes Decisions (Cynthia Rudin, Duke University)
  • Differential Privacy (Matthew Graham, US Census Bureau)

After the final workshop in the series, participants were asked to complete a survey. Although the response pool was small (32 respondents), feedback was positive and respondents offered useful suggestions for topics and logistics. Overall, the workshop series earned a 4.4/5 rating.

The evaluation indicated the selected topics were relevant, with most respondents agreeing “the concepts presented will inform my practice” (25/32) and “the tools highlighted will be useful in my practice” (25/32).

Respondents underscored the benefits of virtual presentations to a broad audience, naming “community engagement” and “being able to learn without travel” as the workshops’ most valuable aspects. One participant commented, “Not being based in North America, it gave me the opportunity to hear from experts I would not get to hear.”

Suggestions for future workshops included an open discussion board for attendees to use during and after each webinar, additional readings and examples for further study and practice, and deeper-dive, multi-part tutorials on certain topics. Several participants also expressed interest in more sessions on various topics.