Consortium Introduces Training on Applied Data Analytics for Public Policy


Rayid Ghani is director of the Center for Data Science and Public Policy, research director at the Computation Institute, and a senior fellow at the Harris School of Public Policy at The University of Chicago.



Frauke Kreuter is a German sociologist and statistician who works as a professor and director of the Joint Program in Survey Methodology of the University of Maryland, College Park.



Julia Lane is a senior managing economist at the American Institutes for Research; a professor of economics at BETA University of Strasbourg CNRS, Chercheur, Observatoire des Sciences et des Techniques, Paris; and a professor at Melbourne Institute of Applied Economics and Social Research, University of Melbourne.


The University of Chicago, New York University, and University of Maryland are launching a short-term, five-session program targeted at government agency staff and policy evaluators and researchers. It is facilitated by Frauke Kreuter, Rayid Ghani, and Julia Lane—three of the editors of the new textbook Big Data and Social Science: A Practical Guide to Methods and Tools.

The program is designed to respond to the burgeoning interest in joining and using data sets across federal, state, and local agencies to enhance decision making. It features hands-on and collaborative work using agency microdata and teaches participants how to approach real social policy problems using real-world data and modern computational data analysis methods and tools. It also offers a unique opportunity to learn alongside and network with practitioners from other cities and states.

Participants learn how to scrape the web; use APIs; manage complex data; apply machine learning, text, and network analysis; and think about inference issues, privacy, and confidentiality.

The initial three cohorts will connect data on different groups of policy interest (i.e., ex-offenders, welfare recipients, and veterans) with their access to jobs. They will do this by linking the characteristics of each residence, public transportation options, and job availability, and then examining outcomes of interest such as earnings, employment, recidivism, or return to welfare receipt.
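
As a rough illustration of that kind of linkage, the sketch below joins person-level records to neighborhood characteristics and then summarizes an outcome. It is not the program's actual code or data: every record, field name (`tract`, `transit_stops`, `jobs_nearby`), and function here is a hypothetical stand-in, and real agency microdata would call for far more care around matching, privacy, and confidentiality.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical person-level records: cohort, area of residence, and outcomes.
people = [
    {"person_id": 1, "cohort": "ex-offenders", "tract": "A", "employed": True, "earnings": 5200},
    {"person_id": 2, "cohort": "ex-offenders", "tract": "B", "employed": False, "earnings": 0},
    {"person_id": 3, "cohort": "veterans", "tract": "A", "employed": True, "earnings": 7400},
    {"person_id": 4, "cohort": "welfare recipients", "tract": "B", "employed": True, "earnings": 3100},
]

# Hypothetical area-level characteristics: transit options and job availability.
tracts = {
    "A": {"transit_stops": 12, "jobs_nearby": 4500},
    "B": {"transit_stops": 2, "jobs_nearby": 600},
}


def link_records(people, tracts):
    """Attach area characteristics to each person, keyed on the residence tract."""
    linked = []
    for person in people:
        place = tracts.get(person["tract"])
        if place is None:
            continue  # drop records whose residence cannot be matched
        linked.append({**person, **place})
    return linked


def employment_rate_by(linked, key):
    """Share of linked people who are employed, grouped by the given field."""
    groups = defaultdict(list)
    for row in linked:
        groups[row[key]].append(1 if row["employed"] else 0)
    return {group: mean(flags) for group, flags in groups.items()}


linked = link_records(people, tracts)
rates = employment_rate_by(linked, "tract")
for tract, rate in sorted(rates.items()):
    print(tract, tracts[tract]["jobs_nearby"], rate)
```

Grouping by `"cohort"` instead of `"tract"` would give the same summary for each group of policy interest, and other outcomes such as earnings could be averaged in the same way.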


The classes will be structured around these linked data sets, which can be modified and expanded by class participants according to their interests. The data infrastructure will make use of new technology (JupyterHub). Specific examples and code will be provided in advance through the notebooks and companion book, so participants can work with developed code and have direct, replicable, and high-value interaction with the data and each other. Based on earlier experience, the classes are also expected to form new networks, create new data assets, and generate useful reports and analyses.

Facilitators are happy to work with managers of different agencies to identify additional evaluation topics and data sets of interest.

Scholarships are available for government agency staff. Applications can be requested from

The program builds on a successful set of pilot classes at the federal level that resulted in the establishment of a new initiative at the U.S. Census Bureau (the Innovation Measurement Initiative), as well as high-quality research from the Census Bureau staff participating in the activity (a research article in Science and one in the American Economic Review).

For information, visit the Applied Data Analytics website or email