The Scoop on Data Journalism

Naila Moreira is the science writing specialist in the Jacobson Writing Center and directs the journalism concentration at Smith College. She has published science journalism, nature writing, fiction, and poetry in venues including The Boston Globe, Daily Hampshire Gazette, and Cider Press Review. She has poetry forthcoming in Scientific American and a middle-grade novel from Walker Books US in the spring.

Benjamin S. Baumer is a professor in the statistical and data sciences program at Smith College. He has been a practicing data scientist since 2004, when he became the first full-time statistical analyst for the New York Mets. He won the Waller Education Award from the ASA Section on Statistics and Data Science Education and the Significant Contributor Award from the ASA Section on Statistics in Sports in 2019.

Data journalism is the practice of telling stories with data. Writer and journalist Naila Moreira and data scientist Ben Baumer teach a hands-on data journalism course at Smith College that focuses on journalistic practices, using data as a source, and interpreting results in context. We wanted to know more about Moreira, Baumer, their course, and data journalism, so we asked the following questions:

Naila, what or who inspired you to be a journalist/writer?

I got started in journalism while working toward my PhD in geoscience at the University of Michigan. I became intrigued by science and the public interest right about the same time I felt a hankering to return to my longtime love of writing.

I pursued a summer science policy fellowship at the National Academy of Sciences in Washington, DC, where I was given the chance to write news articles about breaking scientific findings for the National Academies website. I loved it.

On the advice of Justin Gillis, then a reporter at The Washington Post and later The New York Times, I returned to my graduate program, joined the reporting staff of the Michigan Daily as their first research beat reporter, and—following Gillis’ exhortation—“did not leave.” The clips I wrote for that student paper enabled me to apply for journalism jobs.

Ben, what or who inspired you to be a data scientist?

I don’t think any one person inspired me to become a data scientist. For starters, “data scientist” wasn’t a thing when I was growing up!

I think I’m a data scientist because it provides an approach to solving problems that makes sense to me. I’ve always been the kind of person who finds data to be a compelling vehicle for becoming more informed, and I enjoy learning the various technical skills used in data science (e.g., mathematics, statistics, and computer science).

I also enjoy writing, and that has helped tremendously in making this course successful.

Why do you think it’s important to teach data journalism?

Ben: Journalism is our best tool for keeping the public informed, which, as we’ve seen, plays a vital role in a healthy democracy. There are many jobs in data science that basically amount to figuring out better ways to encourage people to click on links, and we have many students at Smith who want to use their data science skills to do something that will have a more personally meaningful impact upon our culture and society. Data journalism offers a path to do just that.

Naila: Encouraging writers to bridge the gap between the ‘humanistic’ and ‘scientific’ fields is a huge interest of mine. Science is human, and our humanity can grow through science.

Data journalism allows us to turn the analytical lens of data science on people’s daily lives, needs, hopes, and future. From there, the journalist’s writing and reporting techniques can help bring that knowledge and understanding directly to the broader public that needs it most.

Briefly describe the curriculum for your course.

Ben: We read and write!

First, I want to acknowledge that the first version of this course was developed and taught by Amelia McNamara (now of St. Thomas), and we borrowed heavily from what Amelia created.

In a nutshell, Naila and I have different skills and training we bring to the course, but we’re trying to move students into the same place—a place where they can write a high-quality piece in a journalistic style informed by data.

Students learn a combination of skills and improve their abilities with research, writing, data wrangling, and data visualization. We read current articles in various publication venues and workshop the students’ writing in multiple ways. In total, students write three pieces of data journalism, and some of them get published!

Naila: We have a great teamwork relationship. I rely enormously on Ben’s skill in discovering, parsing, and making sense of data sets. My own skills tend to shine most in the direct reporting, writing, and structural design aspects of journalism. Together, he and I discuss newsworthy themes that might reward a data approach for examples or project work for students.

In the classroom, as Ben noted, we start by teaching basic skills that underpin data journalism via shorter written assignments. We then help students design a larger journalistic team project within their own interests.

I try to treat students as if they are already professionals and, indeed, we’ve had student pieces published in our local paper, the Daily Hampshire Gazette.

Do your students pitch stories to you, or do you give them topics to write about?

Ben: Yes, they pitch ideas to us, and we use our collective experience to help them craft viable articles from their pitches. That process has actually been interesting in its own right, because Naila and I have different hunches about what might make a story successful or unsuccessful. Sometimes, the feedback is like, “That’s a great idea, but where are you going to get the data?” Other times, it’s more like, “That’s a cool data set, but what is the story?”

Ultimately, the stories they write come from a student-generated, instructor-approved curated list.

Naila: We do a lot of scaffolding work to guide students toward newsworthy topics, always aiming to clarify exactly what “newsworthy” means—having a timely news peg, an audience, an impact, a doable scope, etc. We also push them to find and explore data sets from sources they may not have known about or thought of before.

However, the thrust of their articles always comes from them. The students surprise me all the time—they bring so much to the table in terms of their own interests and ideas. From their existing knowledge, we work to hone their approach to make it manageable, journalistic, readable, and informative.

Can an individual be both a writer and a data scientist, or do the two generally collaborate?

Ben: I think yes. Obviously, different people have different sets of skills and proclivities, and there are students in the course who are clearly coming to us from one side or the other. But ultimately, we want all students in the course to have some capacity for doing both kinds of work.

For many of the students coming from data science, this is the first college course that has forced them to think intentionally about their writing style, and that experience is often deeply rewarding for them.

Naila: A lot of professional news outlets split duties between people who are officially either data experts or writers. Those two types of journalists collaborate to create the final article. However, I think when each side fully understands the tools of the other, the result comes out stronger. Anyway, nothing stops anyone from either writing well or wrangling data well. Both are just tools, and tools can be mastered.

What is the difference between a journalist and a data journalist?

Ben: Not all journalism is informed by data. When you feel like the data is really pushing the piece forward and is central to what is being discussed, then you’ve got data journalism.

Naila: Ben’s right about data being the core of the data journalism piece. Also, though, data journalists may source their ideas and topics differently. They extract newsworthy ideas from data, rather than from stuff people tell them or that they find in written documents.

Ben and I have often talked about ‘interviewing’ data much like interviewing a human source. In today’s big-data world with gobs of quantitative information available or extractable online, data journalists have a lot of novel opportunities.

Should all journalists be data journalists?

Ben: No, there is plenty of room for non-data journalism.

Naila: Agreed! But all journalists could benefit from some data training.

How does data journalism combat misinformation?

Ben: Misinformation is tough because, if your goal is to misinform, you could make up data and pass it off as data journalism. But I think we’ve already seen changes in political and sports journalism toward more data-driven storytelling. Compared to 10 years ago, a story about the NBA Finals today, for example, is much more likely to focus on which team is scoring with greater efficiency than which team’s star player ‘wants it more.’ In politics, despite all the challenges pollsters are facing, the narratives can’t ignore polling to the degree they could in the past.

Naila: Understanding data and how to write about it can help journalists spot and fight misinformation.

I don’t think data journalism is necessarily less subject to bias or abuse; there are a lot of ways to bend, misrepresent, or falsify data. However, when handled well, data can give reporters and readers an important gut check.

Like, how many people were at Trump’s inauguration? Well, we can argue about that without much backup, or we can report on time-tested methods of crowd estimation, what their limitations and advantages might be, and what they tell us.

What advice would you give to aspiring journalists when it comes to working with data?

Ben: Brooke Williams of Boston University came to our class last year and taught us a great lesson about what events make records. Of course, every time someone gets arrested, there is a record of that, but less obviously, every time someone pays a municipal water bill, there is a record of that. And many of those records are public, so you can get them just by asking (i.e., making a public records request under the Freedom of Information Act).

The best data journalism comes from using a novel source of information to address a question that may not be so obviously related to that data.

Naila: Learn—and ask data experts—before you jump off the deep end and write! I think a temptation with data journalism is to grab a data set and use basic analysis to draw newsworthy conclusions. However, data can hide a lot of assumptions, biases, and blind spots. Is your data set complete? Is it correct? To avoid misrepresentation, it’s important to thoroughly think through your data wrangling before you finalize your conclusions.

What is the best way for a data journalist to get a job or have their work published?

Ben: There are lots of online outlets, but I’ve been impressed with what our students have been able to accomplish at our local paper, the Daily Hampshire Gazette. It’s a real, longstanding, local newspaper and, like many of its kind, it is feeling the pinch economically. They’ve been grateful to have high-quality content submitted. For the students, it’s a hard byline they can take with them wherever they go, and they know they’ve helped strengthen local media in the process.

Naila: I took a nontraditional path into journalism, so I know it pays to be persistent and scrappy. Finding any opportunity to write and publish is the crucial first step to a career. That often means starting small. Write articles for your school newspaper. Write freelance pieces or op-eds for a local paper. Write for public-facing or general-interest sections of a trade publication or newsletter in your area of knowledge. Read about how to pitch effectively. Those examples of your writing—your “clips” as they’re called in news parlance—are the stepping stones to bigger publication venues and professional jobs.

What books do you recommend aspiring data journalists read?

Ben: I really enjoyed reading Dear Data by Giorgia Lupi and Stefanie Posavec because it illustrates how artifacts from your everyday life—which you might not recognize as data—can be used to create compelling data visualizations. These, in turn, could drive stories.

Communicating with Data by Deborah Nolan and Sara Stoudt is wonderful and covers all the bases.

Naila: To complement Ben’s data journalism texts, I’ll plug some broader texts on good writing and journalism.

William Zinsser’s On Writing Well helps any writer improve their nonfiction clarity. Telling True Stories, edited by Mark Kramer and Wendy Call, covers a lot of bases when it comes to designing, reporting for, and writing good journalism. And The New Ethics of Journalism: Principles for the 21st Century, edited by Kelly McBride and Tom Rosenstiel, is crucial reading, as it covers the changing role and responsibilities of journalism in the digital age—the stage from which data journalism as we know it today operates and relates to the world.