Machine Learning: Supervised versus Unsupervised – what’s the difference?

What’s the difference between supervised learning and unsupervised learning? Simply this:

With supervised learning, you tell the system what the correct output is for each input. So maybe you have a photograph and you say “cat” is the right label for it, and maybe you have another photograph and you say “pumpkin” is the right label.

As you give it a lot of photographs, all with the labels, pattern-finding algorithms will turn those patterns into a recipe for taking a new photograph without a label and applying a label like cat or pumpkin to it.
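To make the "recipe" idea concrete, here is a minimal sketch of one very simple pattern-finding approach, a nearest-centroid classifier. The two numeric features (orangeness and furriness) and the function names train and predict are made up for illustration; a real photo classifier would work from pixels and use a far more capable algorithm.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Learn one average feature vector (a centroid) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the new example."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical features per photo: (orangeness, furriness), each from 0 to 1.
labeled_photos = [
    ((0.2, 0.9), "cat"),
    ((0.3, 0.8), "cat"),
    ((0.9, 0.1), "pumpkin"),
    ((0.8, 0.0), "pumpkin"),
]

model = train(labeled_photos)         # the labels are what make this supervised
print(predict(model, (0.25, 0.7)))    # a new, unlabeled photo
print(predict(model, (0.95, 0.2)))    # another new, unlabeled photo
```

The learned model is the recipe: it was built from labeled examples and can now be applied to photographs that arrive without a label.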

When you do unsupervised learning, you don’t give it any labels in advance. You just give it a bunch of photographs and say, “Put similar things together.” A very typical approach is clustering. You might say, “I’ve got these photographs… Create four clusters.”

In other words, create four groups out of these photographs, putting similar things together so that, by some measure, the photographs within each group are similar to one another. What happens next is up to you as the analyst. Unsupervised learning is often used for analytics, though you could also use it for straight-up automation and machine learning. (There’s a lot of terminology confusion here: the same algorithm can be used in different subdisciplines within data science.) Anyway, say you’re doing analytics (data mining) with machine learning.
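As a rough illustration of "create four clusters", here is a small sketch of k-means, one common clustering algorithm (the post does not say which algorithm to use, so this choice is an assumption). The two-number feature vectors stand in for photographs and are invented for the example. Note that no labels appear anywhere.

```python
import random
from math import dist


def mean(points):
    """Average a list of equal-length tuples, coordinate by coordinate."""
    return tuple(sum(column) / len(points) for column in zip(*points))


def k_means(points, k, iterations=20, seed=0):
    """Group points into k clusters by repeatedly assigning and averaging."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to the middle of its cluster (keep it if empty).
        centroids = [
            mean(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters


# Hypothetical feature vectors, one per unlabeled photograph.
photos = [
    (0.10, 0.10), (0.12, 0.08),
    (0.90, 0.10), (0.88, 0.12),
    (0.10, 0.90), (0.11, 0.93),
    (0.90, 0.90), (0.92, 0.88),
]

clusters = k_means(photos, k=4)
for number, cluster in enumerate(clusters, start=1):
    print(f"cluster {number}: {cluster}")
```

The result depends on the random starting points, so different seeds can give different groupings, and some clusters can even come back empty. That is part of why the output is something to inspect rather than trust.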

You’ve asked for these four clusters on this bunch of photographs. Then you go into the clusters, look at which photographs were put together, and ask yourself, “Am I inspired?” Is there something interesting about what was grouped together? Does this give me a direction to pursue on this dataset? *vrooooom* Perhaps it does, perhaps it doesn’t.

Maybe the clusters are based on color and maybe that’s not very interesting to you. You were hoping that they would be based on the type of animal in there. Who knows. The point is that you’re going to look at the groupings, and if you are inspired by what you see, you might stop.

If you’re not inspired, you might change the settings in the algorithm. Maybe you’ll say, “No. Define similarity a different way and then make four clusters.” Or maybe you’ll say, “Nah, try six clusters and see what comes out.” What unsupervised learning gives you is a Rorschach card to help you dream.
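Here is a scaled-down sketch of what "define similarity a different way" can look like. It uses a simple merge-the-closest-groups procedure (single-linkage agglomerative clustering, chosen here only because it is short and deterministic) on four made-up photo records. The fields hue and legs, and the two distance functions, are hypothetical.

```python
def agglomerate(items, k, distance):
    """Merge the two closest groups until only k groups remain."""
    groups = [[item] for item in items]
    while len(groups) > k:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                # Gap between two groups: distance of their closest members.
                gap = min(distance(x, y) for x in groups[a] for y in groups[b])
                if best is None or gap < best[0]:
                    best = (gap, a, b)
        _, a, b = best
        groups[a].extend(groups.pop(b))
    return groups


def names(groups):
    """Show each group as a set of photo names, ignoring order."""
    return {frozenset(photo["name"] for photo in group) for group in groups}


# Hypothetical photo descriptions; a real system would derive these from pixels.
photos = [
    {"name": "orange cat", "hue": 0.08, "legs": 4},
    {"name": "pumpkin", "hue": 0.09, "legs": 0},
    {"name": "grey cat", "hue": 0.60, "legs": 4},
    {"name": "grey rock", "hue": 0.62, "legs": 0},
]


def by_color(a, b):
    return abs(a["hue"] - b["hue"])


def by_shape(a, b):
    return abs(a["legs"] - b["legs"])


print(names(agglomerate(photos, k=2, distance=by_color)))
print(names(agglomerate(photos, k=2, distance=by_shape)))
```

With by_color the orange cat lands with the pumpkin; with by_shape the two cats land together. Same photographs, same algorithm, different notion of similarity. Changing k is the other knob mentioned above.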

You’re going to try things over and over again and see if those groupings inspire you, if they spark some kind of new direction. But be careful! Humans see a lot of patterns that aren’t real. That tendency is called apophenia, so watch out for it.

You want to make sure that you actually follow up in new data and check whether the groupings or inspiration that you found actually hold up in that new data as well.
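One possible way to do that follow-up, sketched under heavy assumptions: take the cluster centers you found while exploring, then measure how tightly brand-new data sits around those same centers. The centroids, the data points, and the idea of comparing average distances are all illustrative choices, not a standard test.

```python
from math import dist


def nearest_centroid_distance(point, centroids):
    """Distance from a point to whichever centroid is closest."""
    return min(dist(point, centroid) for centroid in centroids)


def average_spread(points, centroids):
    """Average distance from each point to its nearest centroid."""
    total = sum(nearest_centroid_distance(p, centroids) for p in points)
    return total / len(points)


# Hypothetical cluster centers found during exploration.
centroids = [(0.2, 0.8), (0.8, 0.2)]

# The data you explored, and new data collected afterwards (both made up).
explored = [(0.18, 0.82), (0.22, 0.79), (0.81, 0.20), (0.78, 0.23)]
fresh = [(0.50, 0.50), (0.45, 0.55), (0.21, 0.80), (0.52, 0.47)]

spread_explored = average_spread(explored, centroids)
spread_fresh = average_spread(fresh, centroids)

print(f"explored data: {spread_explored:.3f}")
print(f"fresh data:    {spread_fresh:.3f}")
if spread_fresh > 3 * spread_explored:  # arbitrary threshold for this sketch
    print("The groupings do not seem to hold up in the new data.")
```

In this toy case most of the new points sit far from both centers, which is a hint that the groupings were a feature of the original sample and not of the world.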
