If you get nothing else out of this post, please read Jake VanderPlas’ great write-up on this topic, “The Big Data Brain Drain: Why Science is in Trouble”, and his follow-up, “Hacking Academia: Data Science and the University”. Although they date from 2013 and 2014, they remain highly relevant, and I fear they will continue to be so for quite some time.
Jake is by no means the lone voice here. In fact there is a growing chorus, not to mention an entire organization in the UK, devoted to the issue. I was just at a meeting hosted by the Computing Community Consortium on “Computing Research: Addressing National Priorities and Societal Needs”, where a question was asked about how we should support infrastructure or other key science software. My answer centered not so much on funding models or technology stacks as on career paths for the researchers among us who build these essential products and services. You might call these people data-driven researchers (we do) or data scientists, and they are having a difficult time inside academic research right now.
In speaking about this over the last few years, I have noticed that researchers who do this kind of data-driven work are fairly consistent in what they need:
Intellectual Freedom
Data Scientists/Data-Driven Researchers are highly trained and educated scientists, at least in the academic research context, and should be afforded the freedom to choose the projects and research they work on. Know when to hire an independent researcher and when to hire a software engineer. Both are important, but the roles differ in how much freedom they have to choose projects.
Respect / Stability
Data Scientists/Data-Driven Researchers who are not tenure-track faculty need a path within academia that is well respected and stable (i.e., not just soft-money positions). This is a hard nut to crack, as there are very few examples of how to do this well. I like (clearly biased on my part) what is emerging at places like Berkeley, UW and NYU with their Computational Fellows, Data Scientist (Research Staff) and Research Engineer roles. If you can’t compete on salaries (and you can’t), you at least need to get this right.
A Diverse Community
Data Scientists/Data-Driven Researchers should not sit alone in the basement, the sole practitioner in a group. This is an emerging area of study that is rapidly evolving, and it will thrive with a variety of perspectives, including different disciplines, genders, and cultural and educational backgrounds. The more we share what is working and not working in this space, the faster we will advance.
One thing you might notice about these items is that they are essentially the attributes that attract people to academia in the first place. Making data-driven researchers an integral part of the science enterprise should be at the front of the priority queue for anyone interested in advancing discovery science.