Iyue Sung is vice president of enterprise analytics at Press Ganey, a health care data, consulting, and technology company. He manages a team of data scientists who uncover patterns related to patient experience, workforce engagement, nursing quality, and clinical quality. Sung’s prior experience includes electronic health records company athenahealth and Oliver Wyman, a strategy consulting firm. He holds a PhD in statistics from The Ohio State University and a BA in mathematics and philosophy from Boston University. You can reach him by email or Twitter (@iyuesung).
In Chinese, the literal translation for cooking, “zhǔ fàn,” is “cook rice.” What does this have to do with data science? It’s a useful metaphor for working as an applied statistician (i.e., data scientist). To explain, let’s take the cooking analogy further and compare ourselves to chefs.
A chef uses both technical skills (e.g., knife skills, cooking technique) and subject knowledge (the chemistry and nuance of how ingredients interplay) to develop and cook a dish. Furthermore, they don’t think about just the individual dish, but the whole meal. You go to a restaurant not to eat, but to experience.
Similarly, a data scientist uses both technical skills (programming to create data visualizations and build models) and subject expertise (the math behind statistical methods) to turn data into knowledge. But your job isn’t just to build a model or create a cool data visualization. It’s to help someone make data-driven decisions. The analysis is one component of the whole. It provides supporting evidence for the larger objective of understanding the client’s problem and providing practical solutions. Put another way, don’t be the colleague who responds to a request with a table of numbers or a data chart. First ask what problem your colleague is trying to solve.
What does it mean to solve a problem? I’ll use my current experience to explain. My company collects—for our clients—data that reflects how well health care is delivered in hospitals, medical offices, etc. We measure aspects such as patient experience (perception of different facets of their stay), employee engagement (perception of a hospital’s culture, processes, and quality), and clinical outcomes (e.g., infection rates, readmission rates). Clients access this data and related analytics through a web application built by our development team.
The obvious purpose of this information is to improve performance (e.g., reduce infection rates or decrease nurse turnover). And one of the data science team’s functions is to help our clients improve their performance beyond what they can do with the application. A common problem might be, “I need to improve the patient-clinician relationship. What components of care delivery impede that relationship?”
Health care is obviously complicated, so answering this question involves many things. One needs to understand how a hospital functions, how clients make decisions (they want good, not perfect, solutions), and how data is structured in the databases. One also needs statistics, programming, and process. The last two are important because we try to scale all the work we do (i.e., develop a process in which we can answer similar questions for all clients without reinventing the wheel). The chef wants a kitchen running like an efficient assembly line to turn out quality meals consistently while accommodating special requests.
Putting together the parts above requires collaboration with a range of people—clients, client liaisons, health care experts, and database architects (to understand the database’s structure). You may also need to consult your data science colleague who wrote an R (or Python) package that can be used for your problem. In other words, a project like this requires taking a broad perspective of what it means to be a data scientist.
To understand this perspective and how our team solves these problems, I find it helpful to group our responsibilities into the following four categories:
- Technical: Programming and statistical analysis tasks
- Process: Organized systems to get things done
- Industry: Understanding the big picture of what you’re doing
- People: Interpersonal skills to get things done
Within each category, a data scientist’s expertise and responsibility grow—or shrink—as their career progresses (represented in the diagram). Initially, they spend much of the day producing analytic work and less time interacting with colleagues outside the team (but you should certainly ask). In contrast, this ratio flips for group leaders. They spend less time on technical matters and more on nontechnical tasks to find more “business” for the team, which, consequently, provides them more opportunities to grow.
Going back to the food analogy, think of the data science team as the restaurant staff. A restaurant requires infrastructure, processes, and a range of positions to run effectively. Together, the chef, manager, cooks, suppliers, and service staff enable all patrons to enjoy the experience consistently.
Your data science team should have the same components: infrastructure (e.g., database and software like R); processes (e.g., your own R packages and a version control system); and a wide range of expertise (programming, data engineering, statistics, industry knowledge). You cannot be an expert at everything, but the team can. If your team is well run, you can focus on your project, which is the “meal” of analysis, presentation, and decision-making. Through this collaborative experience, you and your colleagues will learn from each other.
What I’ve described isn’t original and can be applied to most fields. But given you’re an applied statistician—someone who savors discovery and professional growth—you’ll surely find joy in thinking more broadly about what it means to “do data science.” Once you’ve mastered cooking just rice (risotto ain’t easy), there is much to learn about the many ways it can complete a meal and, ultimately, the entire process of providing a consistent dining experience. Enjoy the experience!