University of Michigan’s Undergraduate Big Data Summer Institute Completes Fifth Year

Matthew Zawistowski, University of Michigan Clinical Assistant Professor of Biostatistics and Big Data Summer Institute Assistant Director

University of Michigan Big Data Undergrads

The Big Data Summer Institute (BDSI)—a joint effort between the departments of biostatistics, statistics, and computer science at the University of Michigan (UM)—recently completed its fifth year, sending its 2019 cohort of 40 undergraduate students into the brave new world of biostatistics and data science.

BDSI was established in 2015 by the University of Michigan School of Public Health Department of Biostatistics to train the next generation of quantitative scientists by immersing them in cutting-edge research projects at the interface of statistics, computing, and health sciences. Led by founding director Bhramar Mukherjee, the program has been a success. To date, BDSI has trained 204 undergraduates, many of whom are now enrolled in elite graduate programs in biostatistics and related quantitative fields. Further, the cohorts have helped address the need to increase diversity in data science: more than half of BDSI participants are women, and approximately 17 percent are from underrepresented minority groups.

The program was initially funded through a grant from the National Institutes of Health’s (NIH) Big Data to Knowledge (BD2K) training initiative and, beginning in 2019, became a National Heart, Lung, and Blood Institute Summer Institute in Biostatistics program. In addition, the program has received generous contributions from within the university and from external sponsors such as the Trehan Foundation and Flatiron Inc.

From its inception, BDSI has employed a successful model of T-shaped learning, in which broad exposure to many topics is paired with deep work in one area, combined with enthusiastic participation from faculty across campus. Students meet for daily morning lectures that provide a broad overview of key concepts and applications in the world of big data. The first couple of weeks establish basic skills in statistics and computing to jump-start work on research projects. As the program proceeds, the morning lectures turn toward advanced statistical and machine learning topics well outside a standard undergraduate curriculum. In the final weeks, students hear about big data applications from a different perspective, that of nonstatisticians. Clinicians, epidemiologists, and sociologists give examples of how they use big data and provide insights into their collaborations with quantitative scientists.

Each afternoon, students split into smaller groups for a deep dive into faculty-mentored research. Each group consists of 10–12 BDSI students focusing on a specific research domain under the direction of 2–3 faculty mentors and 1–2 graduate students. The 2019 cohort explored projects in genomics, machine learning, and data mining, while previous topics included medical imaging and electronic health records. The afternoon sessions allow students to immediately apply concepts introduced during morning lectures to real-world data and to discuss results and next steps with their faculty mentors. The projects are often designed with an open-ended component so that students can pursue their individual curiosities and interests.

Students showcase their research at the concluding symposium. Dressed in fine attire, they take center stage to deliver talks to Michigan faculty and graduate students. This year’s symposium included a keynote by Blake McShane, associate professor of marketing at Northwestern University’s Kellogg School of Management, in which he challenged the BDSI students to reconsider the interpretation of p-values when assessing statistical significance. In previous years, speakers have included Rachel Schutt, co-head of the AI Lab at BlackRock and coauthor of Doing Data Science.

Students learn more than just statistics during their six-week stay in Ann Arbor. They form a close-knit network, bonded over late-night study sessions and social activities like canoeing down the Huron River, a road trip to the Detroit Institute of Arts and, of course, a guided tour of the Big House. They attend a special weekly “Journey Lecture” seminar series, given by scientists at different stages of their careers, while multiethnic cuisine is served for lunch. The students also receive resources for professional development and coaching to prepare for graduate school.

This network of friendship, mentoring, and support persists well beyond the summer. Thanks to social media, students remain close as they complete their undergraduate education and enter the next phases of their lives. Many students even reunite in graduate school, as is the case for 17 BDSI alums currently enrolled in the department of biostatistics at the University of Michigan.

Applications are being accepted for the 2020 Big Data Summer Institute at the University of Michigan.