Max Schleicher, Interviewer
Max Schleicher is a graduate of Rice University. He works in digital marketing for Insureon, where he facilitates affiliate marketing partnerships, improves SEO performance, and occasionally participates in office bake-offs. Hossain Pezeshki earned a PhD in applied probability in electrical engineering from the University of Waterloo. He is now senior data scientist at Insureon, where he oversees forecasting, model building, and all things mathematical for the nation’s fastest-growing online insurance agency.
Long before a company rolls out a new advertising campaign, chances are a statistician designed a test, forecasted the project, and set up parameters to track its success. Hossain Pezeshki—an engineer, probabilist, and mathematician—explains how he found a home in web analytics and marketing and what students of statistics can do to build the necessary mathematical skills to be successful in marketing.
Let’s start with the basics. How can a marketing team use statistical analysis?
Admittedly, this is a broad question, but give us a little background on how you apply statistical methods in marketing and web analytics.
An easy place to start is forecasting. Forecasting means looking at historical trends and projecting where they are going to go. In marketing, you can build intervention models, which let you forecast what would happen if you made no changes to your processes and compare that with forecasts of what would happen if you did make those changes. Or, once you have made a change, they let you check whether your observations can quantify its impact.
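To make the intervention idea concrete, here is a minimal sketch rather than the method used at Insureon: it fits a straight-line trend to the period before a change, extends that line as the "no change" forecast, and compares it with what was actually observed afterward. The weekly figures, the linear trend, and the function names are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_trend(y):
    """Least-squares fit of y[t] ~ intercept + slope * t for t = 0..len(y)-1."""
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, deg=1)  # highest degree first
    return intercept, slope

def intervention_effect(pre, post):
    """Forecast the post-change period as if nothing had changed,
    then return that forecast and the mean gap to the observed values."""
    pre = np.asarray(pre, dtype=float)
    post = np.asarray(post, dtype=float)
    intercept, slope = fit_linear_trend(pre)
    t_future = np.arange(len(pre), len(pre) + len(post))
    forecast = intercept + slope * t_future
    mean_lift = float(np.mean(post - forecast))
    return forecast, mean_lift

# Hypothetical weekly lead counts before and after a campaign launch.
pre = [100, 102, 104, 106, 108, 110]
post = [117, 119, 121]

forecast, lift = intervention_effect(pre, post)
print("no-change forecast:", forecast)
print("mean lift over forecast:", lift)
```

A real analysis would also need seasonality, uncertainty bands around the forecast, and a check that the trend model fits the history at all. The sketch only shows the comparison between the projected and observed signal.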
So when your company is gauging whether to implement a new marketing effort, they can run it by you?
This kind of work must be satisfying intellectually. Is it the intellectual challenge you enjoy most about being a statistician?
Actually, what I find most satisfying is not the intellectual challenge, but when I see that my work is assisting other people. That puts a feather in my cap. That makes me very happy.
For instance, the shortest project I ever had was a two-day project. A company wanted to tweak their telecommunications protocol. Basically, in a matter of microseconds, my model would give them results that would have taken weeks or months to collect on a real network. And here’s the interesting thing: When I projected the pictures on the whiteboard, the engineers’ instincts told them that, yes, the shapes of the curves I produced were exactly what they would have gotten. My work made intuitive sense to them. That’s what I find satisfying.
Obviously, when you’re working in web marketing, you’re not dealing with a closed system. When a company is doing so much of its business online, you’re in a shark tank with other web pages and all kinds of variables that are hard to quantify. How does that throw a monkey wrench into things?
Well, I can’t go into that level of detail because I have not solved the problem yet. Obviously, the more information you have, the more meaningful your results. But even when you’re surrounded by unknowns, I can tell you that just having the history of your own signal alone is extremely helpful. That allows you to do all kinds of things.
How did you first get into statistical analysis, and why did you choose it over other fields of mathematics?
Actually, my formal education was not really in statistics. My formal education was in the theory of probability. Now, of course, the theory of probability can be seen as the foundation of statistics, or you can think of statistics as the practical application of probability theory. Frankly, for years, every time I looked at statistics, I was repulsed by it. I found the way professors would talk about “normal distributions” troubling. My question was always, “How do you know that this thing should be normal?”
So what changed?
There was a very good book written by a probabilist on statistics—professor R.J. Serfling, a great American mathematician—called Approximation Theorems of Mathematical Statistics. That book bridged the gap for me.
In addition, there were the new challenges I was facing as I left grad school. When I started working full time, I was dealing with real data and I found that the more statistics I had, the more readily I could do my job. This time around, when I took some statistics courses, I found them very illuminating. With a little intellectual maturity under my belt, I had a much better appreciation for statistics.
You have worked in many areas of statistics, including quantitative cybersecurity and software development. How do the challenges of working in a marketing department compare?
Actually, marketing is a bit more straightforward. In the previous positions I had, there were no databases where I could go and get data for my analysis. Previously, I had to construct models, and from these simulations, I would get data to analyze. So, in a sense, marketing is actually quite a bit easier because it’s all post analysis, or almost all post analysis. Marketing, by its nature, generates a huge amount of data, which is great for a statistician.
Let’s talk about that data. When you’re working in web analytics, you’re often served a huge pile of data about sales, traffic, and user behavior. How do you go about making sense of that data?
It’s no different than any other project. There are a number of signals. There are a number of processes, and you’re trying to find the relation between them. Again, coming from a probabilist background, I see all regression models as various ways of approximating conditional expectations, and I see conditional expectations as the alpha and omega of all estimation. So basically I don’t see numerous problems; I see one problem with many facets.
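The view of regression as an approximation to a conditional expectation can be illustrated with a small simulation. This is a sketch under stated assumptions, not anything taken from the interview: the data are synthetic, generated so that the true conditional mean of y given x is 2x + 1, and the bin width and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic data (an assumption): E[y | x] = 2x + 1, plus unit-variance noise.
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=n)

# Least-squares line: one way of approximating E[y | x].
slope, intercept = np.polyfit(x, y, deg=1)

# Model-free approximation: average y within unit-width bins of x.
edges = np.arange(0, 11)
bin_index = np.digitize(x, edges) - 1          # values 0..9
bin_means = np.array([y[bin_index == k].mean() for k in range(10)])

# Compare the two approximations at the bin centers.
centers = edges[:-1] + 0.5
line_at_centers = intercept + slope * centers
max_gap = float(np.max(np.abs(bin_means - line_at_centers)))

print("fitted slope and intercept:", slope, intercept)
print("largest gap between bin means and fitted line:", max_gap)
```

Both the fitted line and the binned averages estimate the same underlying object, the expected value of y at a given x. They agree closely here only because the simulated relationship really is linear.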
Part of the challenge of being a statistician is translating your theoretical and practical understanding of a problem to a “layperson,” who may not have a background in mathematics. How do you approach this problem?
I understand things by example, so I communicate things by example. I keep in mind something Albert Einstein said—“When you start solving a problem, try to keep the problem as simple as possible, but no simpler”—which means you want to identify the primary salient features and explain them. To do that, use examples. Other than that, I don’t know how to explain it.
I know you a bit, Hossain, so I’m going to guess your ability to communicate well with your coworkers may have something to do with the fact that you bring chocolate to the office. Is there a correlation?
Yes, I do that, too. But I’ll tell you this. I was preparing for my PhD exam, and you’re supposed to write a document that’s more or less the precursor to your thesis. The first version I wrote, my supervisor did me the great service of throwing it right back in my face. Even though I was communicating something very technical to a highly technical audience, I had botched it. That was a lesson in humility. That told me something about communicating. I had no choice but to learn how to communicate my points clearly.
What advice do you have for students wanting to pursue a career in statistics?
I can tell them this: Don’t dismiss the theoretical background. Not just for statistics and marketing, but for any applied science. The biggest mistake a student can make is to dismiss the theoretical part of their curriculum. Don’t treat it as an unpleasantness that you have to get over with. The theoretical part is what will last.
My advice would be to strengthen your theoretical background. If you have the time, take as much mathematics as you can, even if it’s not in probability and statistics. Study it and internalize it.
Specifically, what classes do you recommend for them?
Linear algebra. Numerical methods. Of course, probability theory, my favorite. And calculus. Then, if you can find a good course on mathematical statistics, take it. Not just the first and second courses of statistics, but the foundational stuff.
What is the difference between mathematical statistics and the first couple of statistics classes a student might take?
Mathematical statistics emphasizes the foundations. For instance, why is it that maximum likelihood estimation works at all? You can write the likelihood, differentiate, set the score function to zero, and solve, but why is it that this works? Why is it that the central limit theorem works at all?
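The mechanical recipe mentioned above (write the likelihood, differentiate, set the score to zero) can be sketched for the simplest case, a coin with unknown success probability p. The sample below is hypothetical and the Bernoulli model is an assumption chosen for illustration. The code shows only the recipe, not the deeper question of why it works.

```python
import numpy as np

# Hypothetical sample of independent 0/1 outcomes.
data = np.array([1, 0, 1, 1, 0, 1, 0, 1])
n = data.size
k = int(data.sum())

def log_likelihood(p):
    """Bernoulli log-likelihood: k*log(p) + (n-k)*log(1-p)."""
    return k * np.log(p) + (n - k) * np.log(1.0 - p)

def score(p):
    """Derivative of the log-likelihood with respect to p."""
    return k / p - (n - k) / (1.0 - p)

# Setting the score to zero and solving by hand gives p_hat = k / n.
p_hat = k / n

# Numerical check: no point on a fine grid has a higher log-likelihood.
grid = np.linspace(0.01, 0.99, 981)
best_on_grid = float(np.max(log_likelihood(grid)))

print("p_hat:", p_hat)
print("score at p_hat:", score(p_hat))
print("log-likelihood at p_hat vs best on grid:", log_likelihood(p_hat), best_on_grid)
```

The score vanishes at the sample proportion, which is the calculation a first course teaches. Showing that this estimator converges to the true p as the sample grows is the kind of result a mathematical statistics course proves.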
It’s seldom that you can go to one of your textbooks and find a formula that’s directly applicable. You have to manipulate it. If you don’t have the theoretical background, that step becomes very difficult for you.