CU Boulder researcher Eric Vance recently won the W.J. Dixon Award for Excellence in Statistical Consulting, in recognition of his work to help statisticians and data scientists become better communicators.
The skills of statistics and data science are broad and varied, requiring those who use them not only to ask the right questions and capture the right data, but to process and analyze it and then convey what they discovered.
Students of statistics and data science are taught methods and modeling, they’re taught to code and to troubleshoot, “but how do we teach students in statistics and data science to become more effective collaborators?” asks Eric Vance, a University of Colorado Boulder associate professor of applied mathematics.
“The thing about modern statistics is that almost anybody can upload an Excel spreadsheet to a statistical software program, do some stuff and get answers. You can have people who understand data, who understand methods and the appropriate conditions to use those methods. But what we want is to grow the number of well-trained data scientists who understand that the context of data matters and who also have that drive to see their work put into action for the benefit of society and know how to collaborate to make that happen.”
For most of his career, Vance has recognized that it’s not enough to be good at statistics and data science—students entering these fields must also learn communication and project-management skills to become effective collaborators. He has designed curricula and academic programs that promote this goal, work that recently was recognized with the American Statistical Association’s W.J. Dixon Award for Excellence in Statistical Consulting.
The award recognizes individuals who have “demonstrated excellence in statistical consulting or developed and contributed new methods, software or ways of thinking that improve statistical practice in general.”
As the youngest winner by at least 15 years, Vance is in the middle rather than at the close of his career, “which is good because there’s still a lot I want to do to translate my framework for collaboration into different languages and cultures, and to build it up across disciplines.”
Doing good with data
Since the beginning of Vance’s academic career, which started as director of the Laboratory for Interdisciplinary Statistical Analysis at Virginia Tech, “I noticed that my students were really good in statistical methods, but only some of them were really good in the non-technical skills, the communication skills,” he says.
“Part of my job was also to teach statistical consulting, so I started to think about what are the key aspects that a student needs to know, that a student can learn to become an effective, collaborative statistician?”
Good data scientists have a deep store of quantitative skills, he says, and many enter the field because they want to work with real data and pursue projects that help society and benefit humanity. Plus, in this hyper-plugged-in world, data are everywhere—powerful data in huge datasets with the potential to have sweeping effects. The demand for people who can analyze data properly and leverage them appropriately is growing.
“But what I noticed is kind of holding statisticians and scientists back is not technical skills—it’s not that they don’t know the latest analysis technique—but it’s that they don’t have the communication skills,” Vance says. “That became my focus: What is it that a student or a data scientist needs to know to effectively unlock the technical skills to do the most good?”
At CU Boulder, Vance established and directs the Laboratory for Interdisciplinary Statistical Analysis (LISA), housed in the Department of Applied Mathematics, to teach students “to become effective interdisciplinary collaborators who can apply statistical analysis and data science to enable and accelerate research on campus and making data-driven business decisions and policy interventions in the community.”
Vance explains that often statisticians and data scientists are not the ones collecting the data they analyze, so “if we want to develop new methods, we need to have data, and who has data? Everybody else. Domain experts are everywhere around the world, so statistics and data science should be collaborative disciplines, and students should learn to work with a chemist or a biologist or an English professor or an elected official to help them think about what kind of data they have, help them collect high-quality data and transform into policy and action.”
More than just good with data
Vance and his colleagues have built LISA into the center of the LISA 2020 Global Network of statistics labs, which aim to strengthen local capacity in statistical analysis and data science and to transform academic evidence into action for development.
The LISA 2020 Global Network comprises 35 statistics labs in 10 countries, including Nigeria, Brazil and Pakistan. Vance is now a Fulbright fellow in Indonesia, where he’s working with colleagues at IPB University to develop a course in effective statistics and data science collaboration and establish a new statistics and data science collaboration center.
Several years ago, Vance and research colleague Heather Smith developed the ASCCR framework—which stands for attitude, structure, content, communication and relationship—to support this model of statistics and data science education that incorporates collaboration skills. Vance’s work in Indonesia is also exploring how to adapt ASCCR within different cultural contexts.
“We want statistics and data science students around the world to have the skills to collaborate and communicate with domain experts,” Vance says. “Maybe it’s a researcher around campus, maybe a local policy maker, maybe a local businessperson—anybody who has data and wants to be able to do something with the data, make a decision based on the data or come to some conclusion.
“We want students to become people who can talk with a domain expert to understand what the problem is, what the data are, how they were collected, the provenance of the data, and then figure out what the domain expert actually wants to do with the data. That means understanding the workflow of collaboration before actually analyzing the data and coming up with some statistical results. Then they need to translate those results to answer the original research question or come up with a conclusion and recommendations for action. You can’t just be good with data anymore; you have to be able to communicate why it matters.”