How to Win as a Machine Learning Company

Hi everyone. Today I want to talk about how data can help you win with your machine learning based business. Before we start, let’s make one thing clear: too many people now call machine learning “AI” or “artificial intelligence” just because it sounds cooler. Artificial intelligence is neither artificial nor intelligent, so let’s call things by their right name, which is machine learning.

Many companies and startups are building their products and services on machine learning models, and maybe you are working in one of these companies. Of course, everyone is trying to offer the best possible solution. And if you are trying to offer the best possible machine learning based solution, you should know what matters most here. There are several studies showing that if you have different competing approaches, it’s typically the approach with the most training data that wins, and not necessarily the one with the best algorithm. Think about how such a system is developed.

The algorithms are typically public. They are either available in some open source library or published as a paper, where you can take the paper and reproduce the algorithm. This also means that every company has access to the same algorithms. But the amount of data you have to train your model is what makes the difference, because not everyone has access to the same data, and machine learning algorithms can perform better if they have more data to learn from. This is why big, established, data-oriented companies have a huge advantage over their newly starting competitors.

It’s like a closed loop. The more users you have, the more data you can capture. The more data you have, the better service or product you can offer. And the better your product is, the more users you can get. This creates a closed loop, and for new companies it can be extremely hard to get into this loop. So having a lot of good data is the number one advantage you can have.
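To make this loop a bit more concrete, here is a small toy simulation in Python. It is only a sketch: the function name, the parameters, and the formulas (how much data a user produces, how quality depends on data, how quality attracts users) are all made-up assumptions for illustration, not measurements from any real company.

```python
def simulate_data_loop(initial_users, rounds, data_per_user=1.0,
                       growth_per_quality=0.5, half_quality_data=1000.0):
    """Toy model of the users -> data -> quality -> users loop.

    Every parameter and formula here is a hypothetical assumption.
    """
    users = float(initial_users)
    data = 0.0
    history = []
    for _ in range(rounds):
        # The more users you have, the more data you capture.
        data += users * data_per_user
        # The more data you have, the better the product
        # (assumed diminishing returns, quality stays below 1).
        quality = data / (data + half_quality_data)
        # The better the product, the more users you get.
        users += users * growth_per_quality * quality
        history.append({"users": users, "data": data, "quality": quality})
    return history


# An established company versus a newcomer, same algorithm, same rules.
incumbent = simulate_data_loop(initial_users=1000, rounds=5)
newcomer = simulate_data_loop(initial_users=10, rounds=5)

for name, run in (("incumbent", incumbent), ("newcomer", newcomer)):
    last = run[-1]
    print(f"{name}: users={last['users']:.0f}, quality={last['quality']:.2f}")
```

Under these assumptions both companies play by exactly the same rules, and the only difference is the starting number of users. The one that starts with more users collects more data every round, so the gap in product quality and in users gets wider instead of closing.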

One approach you can try is to expand your data by creating synthetic data. This can definitely improve your situation, but it has its limits. Another option is to try to buy such data from an existing company, and if you are lucky, you may be able to find a seller. If the data you need is generated by users or devices in the real world, another approach you can try is so-called data crowdsourcing.

For example, if you need people’s web browsing data, their fitness tracker data, or data from the services they use, you can try to buy the data directly from these users. This approach is very new, and it can help you get data others don’t have. Data crowdsourcing can also help you get data that your competitors have and are storing. All you need is the consent of their users. We will look at how data crowdsourcing can work in detail next time.

So let me know what you think. Are you in need of more data? Or was it easy for you and your company to get the data you need? Which approach have you tried? Let me know.


