5 Things I Wish I Knew Before Becoming A Data Scientist
Today we’re going to talk about 5 things I wish I knew before becoming a data scientist. Three and a half years ago I took the leap from aerospace engineering into data science, and since then I’ve learnt a couple of things that’ll be extremely helpful to you if you’re thinking of diving into data science.
Data Science is Not Just Training Models
The first thing I wish I knew is that data science is not just training models. Before I started working as a data scientist, that is exactly what I thought it was. My naive notion was that you would collect data from somewhere (I wasn’t really sure how), train your model on that data, and then predict a bunch of outputs. I wasn’t too sure what you would do with those outputs, but I assumed that most of the time in the data science cycle was devoted to models. I was really wrong: as I got to find out, modeling is less than 10% of the work.
You’ll be working on a lot more than just models, and this can include data collection, labeling, data verification, process management and monitoring. This definitely does not sound as exciting as models, but that’s the reality if you’re trying to build big scalable systems. There are definitely roles out there if you want more experience with models, and these tend to be at consulting companies. At those companies you’ll be exposed to a diverse range of industries and projects, and the projects can last from just a couple of months to about a year.
The downside is that your project is unlikely to be deployed at a large scale. The reality is that building fully fledged AI systems takes a long time, and a company might have to dedicate many years of technical effort to get to the point where it has a fully functioning, autonomous ML pipeline.
How Important Coding Is in Data Science
The second thing I wish I knew is how important coding is in data science. Having a good coding foundation is important to being successful in data science. If you’re already interested in data science, I’m sure you’ve checked out the online data science courses, most notably Machine Learning by Stanford. The main aim of these online courses is to give you a rundown of the ML algorithms used in the industry, and they might even give you a bunch of code snippets that you can copy and paste to get an algorithm up and running.
All of that is fine if you’re getting started with data science, but if you really want to level up, I’d definitely recommend becoming proficient in a programming language. A good one to start with is Python, and if you already have Java experience or good experience in another object-oriented programming language, it’s not going to be too hard for you to pick up Python.
I’d also recommend learning Bash, just so you know your way around the terminal, as well as version control using Git. One plus point of learning to code is that it builds skills that will be important regardless of whether or not you choose to go into data science, as those skills are equally important in other software engineering disciplines.
Data Science Job Ads Can Be Very Misleading
The third thing I wish I knew is that data science job ads can be very misleading. Data science is a very new field. Given the amount of publicity around this domain, a lot of companies want to implement AI in their workplace but aren’t too sure exactly what skills they need, or whether they’ve even got enough data to do anything with. What they end up doing is just googling what kind of skills a data scientist often needs and putting all of those skills in the job ad.
So you can often find data science job ads that want the candidate to have skills in SQL, Python, Scala, Hive, Hadoop and Spark, and that’s definitely an alarm bell, because it just means that they’ve exhaustively listed all the technologies that a data scientist might use and they’re not too sure about the direction of data science in their own company.
So I would focus more on job ads that list a couple of technical skills but also mention the soft skills you might need, such as communication or presentation skills. I’d also recommend looking at the overall reputation and culture of the employer who posted the job ad. You can find out more about both on Glassdoor reviews, which I found very useful on my hunt for data science jobs. Once you get past the resume review stage, definitely make sure to grill the technology teams at the companies you’re interviewing with, so you can gauge exactly how serious they are about data science and machine learning. This helps you make sure that you end up at a company where you’re going to be surrounded by lots of smart people that you can learn from.
How Important Communication Skills Are
The fourth thing I wish I knew before becoming a data scientist is how important communication skills and attention to detail are. Communication matters in any field, including technical roles, because it makes sure that you can work well with your team, communicate problems and propose solutions accordingly.
But it’s especially important in data science, because data science teams tend to sit between the engineering teams and the sales team, so they need to be well versed in converting technical details into a big-picture summary. As a technical person it’s very easy to get bogged down in the details, so you have to remember that you’re working on a product that you’re going to be serving to a client who might not be as technical as you are. In some other technical professions you can get away with not interacting with the client much, but this is definitely not the case in data science.
The second thing I mentioned is attention to detail. This is important for a couple of reasons. One of them is that good attention to detail can help you figure out the shortcomings of your training data, which helps you down the line in making your model perform better. It’ll also help you highlight the edge cases that your model fails on. So it can’t be overstated how important attention to detail is in data science, especially if you’re at the cutting edge and trying to squeeze out every single percent of model accuracy.
The Different Types of Data Science Jobs
The fifth thing I wish I knew is the different types of data science jobs out there. I’ve got three categories for those. The first is product enhancement with data science. That means improving an existing product at the company you’re working for with ML techniques. It could be ML working behind the scenes of the product in a non-obvious way, such as in Google Search, or it could be fraud detection on credit cards using anomaly detection.
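To make the anomaly detection idea a little more concrete, here is a minimal sketch of one simple approach: flagging transaction amounts that sit far from the median, using a modified z-score. This is only an illustration under stated assumptions, not how any real card fraud system works. The function name `flag_anomalies`, the threshold of 3.5 and the sample amounts are all hypothetical.

```python
from statistics import median


def flag_anomalies(amounts, threshold=3.5):
    """Return the indices of amounts whose modified z-score exceeds threshold."""
    med = median(amounts)
    # Median absolute deviation: a measure of spread that outliers barely affect.
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # At least half of the values equal the median, so this method cannot score them.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


# Hypothetical card transactions in dollars; the last one is unusually large.
transactions = [12.5, 8.0, 15.2, 9.9, 11.3, 14.1, 10.7, 950.0]
print(flag_anomalies(transactions))  # prints [7]
```

A production system would look at far more than the amount (merchant, location, timing and so on), but the core idea of scoring how unusual an observation is compared with typical behaviour is the same.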
The second type of data science job is ML as a product, where ML serves as the sole product for a particular application. A good example of this is autonomous driving, which is primarily done by image segmentation on the fly as the car is driving. The sole product there is machine learning, as autonomous driving would not be possible without it.
The third type of data science job is in decision making. That could be the role of an insights analyst, a type of data scientist who works inside a company to figure out how best to optimize its different processes in order to cut costs and increase revenue.
An example of this could be analyzing your clients’ product usage to figure out the best way to lay out the features to improve usage. Another example could be using machine learning to figure out whether any cost in the finance department is blowing out of proportion compared to five years ago, which could be a simple form of forecasting.
That wraps up five things I wish I knew before becoming a data scientist.