We have all heard it, or read about it, or both. The data scientist is dying, and there is little we can do to hold on to our cushy salaries, rock-star-like images, and inflated egos. Obviously, I am overstating things here for dramatic effect, but the message is still anxiety-provoking for many data science professionals who have begun to smell blood in their industrial waters as concepts like “citizen data scientist,” “democratization of analytics,” and “automated machine learning” are thrown around by more and more executive teams. Such fears were stoked earlier this year when Matt Tucker’s article “The Death of the Data Scientist” (https://www.datasciencecentral.com/profiles/blogs/the-death-of-the-data-scientist) was published on Data Science Central, though it wasn’t the first to make such a claim (https://insights.dice.com/2014/03/05/data-science-is-dead/). But are data scientists, as we know them today, truly a breed bound for extinction? In the remainder of this post, I explore this idea while offering an alternative perspective on what the future may look like for the current data science professional.
The Rise of the Machines
At its most fundamental level, the argument goes something like this: many of the activities of the data scientist are quantifiable or statistical in nature and are, as a result, automatable. Therefore, the better we can orchestrate statistical models together in an automated fashion, the less need there is for a data scientist to be pulling on those levers that select, optimize, and deploy their data-driven insights. Indeed, companies and products such as DataRobot, Google’s AutoML, and the ever-expanding set of pre-trained, service-based data science models (Azure Cognitive Services, Google Ai Services, AWS, Watson) have made significant strides toward achieving just that: an artificial data scientist.
From Rise to Ride
Despite this dire prognosis for a field that was always poorly defined anyway, those who claim the title of data scientist have nonetheless developed ample skill to evolve with the coming wave of artificial data science. Thus, we must replace our conjured-up images of the Skynets of the world rising to overthrow the last remaining strongholds of human data scientists with images of explorers riding the hype-wave of artificial intelligence technologies that are fundamentally embedded with skills that only the human data scientist truly understands. To achieve such evolution, there are three areas the practicing data scientist must focus on and that the employer of the future must inspire: an ever evolving/expanding toolkit, the importance of the user experience, and the evangelism of a trade.
The Ever Evolving/Expanding Toolkit
If there is one thing that data scientists are good at, it’s catching a buzz (and a few new tools along the way). The concept of data science itself is a buzz term that many business professionals with any statistical understanding attached to themselves in order to improve their marketability, and to good effect. Why should we expect the building wave of artificial intelligence to be any different? As the concept of the data scientist has evolved, so too have the tools associated with it, and thus the professionals in this field have been caught in a constant race to remain relevant by exposing themselves to the newest tools being made available. Although the rate of change has been close to overwhelming, those who have survived and been able to demonstrate competence around the core functionality of these data science technologies are well poised to take advantage of the tools of artificial intelligence. Thus, the data scientists who learn to evolve will learn how to rebrand themselves as practitioners of artificial intelligence. But to be able to convince others of this rebrand, such professionals will need to continue to expand their toolkits. Whereas the 2000s and 2010s brought us Hadoop, NoSQL, IoT, Python’s scikit-learn, TensorFlow, and Spark, the next generation will be leveraging Ai-as-a-Service, cloud computing, intelligent automation, and containerization for analytics. This means that data scientists must continue to learn how to leverage API calls, architect cloud environments that support data science, and deploy analytics that expose API endpoints.
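To make "deploy analytics that expose API endpoints" a little more concrete, here is a minimal sketch using only Python's standard library. Everything specific in it is an assumption made for illustration: the feature names, the coefficients, the /predict path, and the churn framing are hypothetical, and a real deployment would more likely load a trained model and sit behind a production web framework.

```python
import json
import math
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "model": fixed logistic-regression coefficients.
# In practice these would come from a trained, serialized model.
WEIGHTS = {"tenure_months": -0.05, "support_tickets": 0.4}
INTERCEPT = -1.0


def predict(features):
    """Return a probability from the hypothetical logistic model."""
    z = INTERCEPT + sum(WEIGHTS[name] * float(features[name]) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def handle_payload(body):
    """Turn a raw JSON request body into (HTTP status, response dict)."""
    try:
        features = json.loads(body)
        return 200, {"churn_probability": predict(features)}
    except (ValueError, KeyError, TypeError, OverflowError) as exc:
        # Malformed JSON, missing features, or non-numeric values.
        return 400, {"error": str(exc)}


class PredictHandler(BaseHTTPRequestHandler):
    """Expose the model at POST /predict (path name is an assumption)."""

    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_payload(self.rfile.read(length))
        data = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def serve(host="127.0.0.1", port=8000):
    """Start the endpoint; blocks until interrupted."""
    HTTPServer((host, port), PredictHandler).serve_forever()
```

Calling serve() and sending a POST request with a JSON body containing tenure_months and support_tickets to /predict returns a JSON object with a churn_probability field. Keeping the scoring logic in handle_payload, separate from the HTTP handler, is one way to let an application team test the model's behavior without starting a server.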
The Importance of the User Experience
As you can see from the above, statistical tools are not the only tools that will help data scientists survive in this quickly changing landscape. Artificial intelligence is not merely statistical technology; rather, it is the embedding of statistical technology into user experiences. Thus, the savvy data science survivalist will identify opportunities to solve problems using embedded statistical analytics. Such efforts will require a greater understanding of software programming concepts, for which the data scientist is already well positioned through familiarity with open source scripting tools, and the ability to work more closely with application development teams. There are many ways to tackle the user experience problem from both a technical and a theoretical perspective (see our previous blog post for one example), and what works will always depend on satisfying the user. The key is to identify strategies whereby statistical models improve the user experience. In this way, data scientists will need to continue to evolve their approach to problem solving. Where once we focused on using cutting-edge modeling techniques to extract insights from data, we now need to focus on the utility of those techniques within an application.
Evangelizing a Trade
And finally, because the true test of our data science products depends on the user’s ability to get value from them, we must be prepared to take our specialized understanding of these Ai-enabling technologies and empower the citizen data scientist rather than pontificate over the sacredness of our specially anointed knowledge. Despite the apparent ease of use promised by the onslaught of automated data science products, citizen data scientists will still lack an understanding of how to apply them. As one Reddit user so elegantly put it, “most people can barely use Excel, and even most data/business analysts have a hard time understanding anything beyond basic aggregation and statistics” (https://www.reddit.com/r/BusinessIntelligence/comments/8z148d/the_death_of_the_data_scientist_interesting_read/). Thus, businesses will look to data scientists to train the citizen data scientist of the future to use those tools as use cases permit. Data scientists will be required because data science is not a tool but a way of thinking about and tackling problems. Tools certainly enable new ways of thinking, but people need to be trained on how to think about a tool in order for that tool to change their approach to solving problems. In short, we must evangelize the tools that enable the artificial data scientist. In this vein, data scientists become the hub of both artificial and human data science products within an organization, and the citizen data scientists become the spokes.
From Data Scientist to Ai Practitioner
In conclusion, the data scientist is not dead, or dying for that matter, but is instead due for an evolution. Those who are most successful in continuing to expand their toolkits to leverage Ai services, expose results to and interact with applications, and impart their way of thinking to enable others will be the best positioned to meet the coming need for Ai practitioners in the digital enterprise of the future.