Guiding Internal Collaborators Through Statistical Training

George Rodriguez is a chief scientist who earned his master’s in statistics after completing a doctorate in chemistry.

Min Chen is a lead statistician. She earned a PhD in statistics after completing her master’s in oceanography at Texas A&M University.

Statistical consulting is a major path of employment for ASA members, so considering improvements in the way we engage others should be an ongoing process. One such improvement, encouraging collaborators to build their own statistical skills, is particularly useful when the statistician is part of a disproportionately small cohort of data professionals within an institution.

One argument against encouraging collaborators to develop their statistical knowledge is that we could diminish our future impact (i.e., work ourselves out of a job). A second concern is that the empowerment that comes from learning ‘some’ statistics leads to incorrect usage, leaving data professionals to clean up the mess in postmortem analyses. Although how often we encounter these two arguments may dampen our inclination to encourage others to develop statistical literacy, a slight shift in vantage point affords a fresh perspective.

Continuously encouraging others to expand their statistical skills leads them to think of you as a technical leader and increases your influence throughout the organization, since most colleagues will ask for your guidance in this endeavor. Don’t send them to sort through endless resources. Embrace the spotlight and steer colleagues through the overwhelming statistical training landscape.

In addition to developing your leadership brand, getting collaborators trained in statistical methods has other benefits. As collaborators and internal customers grow their expertise to address simple and standard statistical applications, you will have more time to explore novel areas of application throughout your institution and continue developing your own skills. These additional competencies can be marketed as organically gained capabilities.

Your long-term relevance to the organization comes from highlighting that markets change rapidly, that keeping up with new methods is a necessity, and that the evolution of adjuvant technologies will require highly trained data professionals.

Although you are not responsible for the entire learning journey, you should certainly recommend the first three steps: (1) know where to begin; (2) identify the most useful advanced topics; and (3) develop statistical thought processes.

Being specific is extremely helpful when guiding nonspecialists through the first step toward basic statistical literacy. It’s ineffective to suggest relearning the basics and then picking up advanced methods as needed. Understand individual backgrounds and needs so you can recommend the best options to facilitate every unique journey and help colleagues see connections to their work. High specificity regarding what to learn and good alignment with concurrent work are particularly important to adult learners who believe their training time should be highly focused on the most relevant topics.

As you engage individuals, assess where each person is on their learning path. For example, our organization consists of PhD engineers and scientists with good quantitative backgrounds that typically include courses in basic statistics. This type of technical foundation suggests a little review is sufficient for our colleagues to jump into deeper waters. Yet, we are prepared with a few options for those lacking even the most basic exposure to the field. Recommendations for this group include short courses from vendors or professional societies (American Statistical Association, Royal Statistical Society), MOOCs, or bona fide introductory online university courses. Be careful with terminology when assessing an individual’s background, as some colleagues may claim they understand linear models because they know how to fit a straight line to a scatter plot.

Remember that a little thoughtfulness goes a long way when engaging collaborators who come from a variety of backgrounds. Suggestions can be found in the Amstat News article titled “The Right Tone Sustains Productive Dialogue.”

Assist your colleagues as they progress to the second stage of their statistical training journey. Creating a standard curriculum everybody follows is the easiest approach, yet it leads down the wrong path, where learners spend time on methods not relevant to their work, become frustrated, and likely disengage entirely.

Accordingly, selecting the correct methods also requires you to understand organizational needs. We recommend design of experiments (DOE) for colleagues engaged in high-throughput experimentation, where it helps maximize the utility of complex experiments. Mixture designs to optimize formulations are recommended for those developing lubricants and polymer products. This community also benefits significantly from DOE to compare product performance and make marketing claims.
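To give formulators a concrete picture of what a mixture design looks like, the sketch below enumerates a simplex-lattice design, one common family of mixture designs. It is only an illustration: the function name `simplex_lattice`, the choice of three components, and the lattice degree of two are assumptions made for this example, not a description of the designs used in our organization.

```python
from fractions import Fraction
from itertools import product


def simplex_lattice(n_components, degree):
    """Enumerate the runs of a {n_components, degree} simplex-lattice design.

    Each run is a tuple of component proportions taken from
    0, 1/degree, 2/degree, ..., 1 that sum to exactly 1.
    """
    levels = range(degree + 1)
    return [
        tuple(Fraction(k, degree) for k in combo)
        for combo in product(levels, repeat=n_components)
        if sum(combo) == degree  # keep only blends whose proportions sum to 1
    ]


# Hypothetical three-component formulation (e.g., base oil and two additives)
design = simplex_lattice(n_components=3, degree=2)
for run in design:
    print([float(proportion) for proportion in run])
```

The constraint that proportions sum to one is what distinguishes a mixture design from a standard factorial design, and it is usually the first point worth explaining to a colleague who is new to the topic.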

Our process engineers rarely have the luxury of using DOE in expensive manufacturing environments. However, they are flooded with time-stamped data, which naturally elicits the need for time series methods. They also need to improve process performance and determine if products meet specifications. This scenario calls for process capability analysis and statistical quality control.
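A short calculation can make process capability analysis tangible for engineers who are new to it. The sketch below computes the two most common capability indices, Cp and Cpk, from a sample and a pair of specification limits. It assumes the process is in statistical control and the measurements are roughly normal; the function name `process_capability`, the readings, and the specification limits are hypothetical values chosen for illustration.

```python
from statistics import mean, stdev


def process_capability(measurements, lsl, usl):
    """Return (Cp, Cpk) for a sample given lower and upper spec limits.

    Assumes an in-control process with approximately normal data.
    """
    mu = mean(measurements)
    sigma = stdev(measurements)  # sample standard deviation
    cp = (usl - lsl) / (6 * sigma)  # potential capability, ignores centering
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)  # penalizes an off-center process
    return cp, cpk


# Hypothetical measurements and specification limits
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
cp, cpk = process_capability(readings, lsl=9.4, usl=10.6)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Comparing the two indices is a useful teaching point: Cp and Cpk agree when the process is centered between the limits, and Cpk drops below Cp as the process mean drifts toward either limit.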

Analytical chemists characterizing materials find multivariate analysis (MVA) most useful. They appreciate MVA to such an extent that they have developed the moniker chemometrics to describe the application of statistics in their analytical work. In fact, our analytical chemists are excellent allies who help promote the value of learning statistics for industrial applications.

Advances in user-friendly software greatly facilitate statistical analyses. However, these advances also make it easy to misuse methods when users do not understand the fundamentals, including variation, randomness, and sampling bias. These concepts are critical components of statistical thinking and must be developed.

It is necessary to define research objectives, understand processes to design meaningful data-collection schemes, critically select statistical methodology, assess whether the underlying assumptions are valid, and make inferences from data relative to the objectives. Finally, it is important to communicate results in a way that highlights the central role statistical thinking plays in the entire process.

Missteps should be expected as proper statistical thinking and usage become well established. It is therefore important to nurture a culture in which learners are comfortable discussing their work in a judgement-free environment. Initiate study groups in which everyone agrees to behavior and communication norms.

Give participants ownership of their learning. For example, offer a short list of common statistical mistakes and ask them to grow the list. Recommend appropriately accessible books and papers describing these mistakes so learning becomes self-paced. Organize a statistical advisory board consisting of those who are further in their training journey to provide objective advice.

Once again, understanding the organization allows you to recommend the best guardrails to keep collaborators on the right path and keeps you from reworking incorrect analyses.