What Is Data Science, and What Does a Data Scientist Do?

What is data science and what do data scientists do? Stick around and you’ll find out. So what is data science, and what is the role that Harvard Business Review called the sexiest job of the 21st century? The word science in the phrase data science means using the scientific method to turn data into value.
That means asking the right questions, creating hypotheses, and then devising experiments to test these hypotheses. Ultimately, these experiments usually result in conclusions, discoveries and inventions, and, in the case of AI and machine learning, in predictive and prescriptive models.
In an article of mine, I define what I call the pillars of data science expertise. In an ideal world or unicorn-like scenario, these pillars represent the areas in which data scientists should be expert. The first is business or domain expertise, so think about an MBA-type person that’s also an expert in some industry or domain. The second is mathematics expertise, so things like statistics and probability.
The third is computer science expertise, so think software architecture and engineering. And the fourth is communication expertise, so things like written and verbal communication, so that data scientists can deliver results, conclusions, reports and findings to senior executives and so on. In reality, people are usually strong in one or two of these pillars, but not equally strong in all four.
If you do find yourself a data scientist that’s equally strong in all four of these pillars, then you’ve done a great job and you’ve found a unicorn. So based on these pillars, a data scientist is somebody that’s able to extract meaningful information and insights from existing data sources, as well as identify and use new data sources, in order to support and drive business decisions and actions that ultimately achieve business goals.
And ultimately this is done using business domain expertise, along with effective communication, and having the ability to use any and all relevant programming languages, software packages and libraries, data infrastructure, and so on. The process that data scientists use to turn data into value is pretty similar across the board, although there are different models and names associated with it. One of the more common models is called CRISP-DM. And of course, use the comments below to let us know of any other process models that you’ve used or recommend.
I introduce a model that I created called the GABDO process model. I regularly work with executives, managers, entrepreneurs, and so on, and so I created the model because I think it’s better suited to that type of audience. But there’s absolutely nothing wrong with the CRISP-DM model or any other model like it. The GABDO process model consists of five iterative phases, which are goals, acquire, build, deliver and optimize.
And the reason they’re iterative is that any one of these phases can lead back to one or more of the phases before it. And a lot of that comes from the experimental and exploratory sort of scientific nature of data science, as we’ve already discussed.
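To make the iterative idea concrete, here is a minimal Python sketch of the five phases named above. The names `Phase` and `allowed_next` are hypothetical, made up for illustration and not part of any published tooling for the model. The sketch also assumes that the only forward move is to the very next phase, which the description above does not state explicitly. The backward moves follow the description directly: any phase can lead back to any phase before it.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The five phases, in the order they are listed in the text."""
    GOALS = 1
    ACQUIRE = 2
    BUILD = 3
    DELIVER = 4
    OPTIMIZE = 5


def allowed_next(current):
    """Return the phases that `current` may lead to.

    Backward: every earlier phase (from the text).
    Forward: only the immediately following phase (an assumption).
    """
    earlier = [p for p in Phase if p < current]
    following = [p for p in Phase if p == current + 1]
    return earlier + following


if __name__ == "__main__":
    for phase in Phase:
        names = ", ".join(p.name.lower() for p in allowed_next(phase))
        print(f"{phase.name.lower()} -> {names}")
```

Under these assumptions, the build phase, for example, can move forward to deliver or loop back to acquire or goals, which is the kind of backtracking an exploratory, experiment-driven process tends to need.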