As someone from a web development background myself who’s been learning about data science, AI and machine learning, I wanted to clarify some of the concepts and definitions which I personally found a little confusing along the way.
What is the Difference Between AI, Machine Learning and Data Science?
So let’s talk about the differences between data science, AI and machine learning. Data science, generally, is solving complex problems using data; this could cover things such as analytics, data mining, visualisations, statistics and more.
Artificial intelligence is the simulation of a human brain function by machines, so this would cover perception, so vision, touch, hearing, actions and movements, so robotics and the ability to move and manipulate objects, natural language processing, so speech and text, planning, playing chess and predicting moves, and reasoning and knowledge, for example, IBM Watson playing the quiz show, Jeopardy.
Machine learning involves looking at data and finding insights without specifically being told what to look for. This is different to traditional computer algorithms because it specifically involves learning. It’s generally accepted that machine learning is a subfield of both data science and AI. Machine learning itself
has three main subfields, supervised, unsupervised and reinforcement learning.
And whilst there are many machine learning algorithms which solve specific problems, the main thing which you should focus on when you’re starting your journey into machine learning is artificial neural networks.
Now I have no doubt that many of you have already heard of artificial neural networks; perhaps you think they sound interesting and you’d like to know more, or maybe you’ve even been reading or learning about them already. I can tell you that artificial neural networks are the most exciting thing within the field of machine learning right now.
They’re incredibly general and incredibly powerful, and they’re even moving into other AI subdomains such as natural language processing and robotics, where previously applied algorithms are being replaced with neural networks to great success. They’ve actually been theorised since the 1950s, but it’s only recently that we’ve had enough processing power to actually get them to work.
Artificial neural networks are inspired by biological neural networks; they’re based on the human brain and how we think. This is what a real biological neuron looks like: the dendrites are the inputs and the axon is the output. When enough signals arrive at the dendrites, the neuron fires a signal down its axon. Your brain has around 100 billion of these neurons all connected together in a neural network.
An artificial neuron works much the same: it’s essentially an algorithm which receives a set of values, and if those values are high enough then it will activate. In this case, we have two inputs, 0.3 and 0.7, and the next step is to assign those inputs weights, which represent how important they are.
Usually, especially using a library such as TensorFlow, you would initialise all of these weights randomly, and then they would be automatically adjusted to reflect the importance of the inputs based on the findings of the neural network. You would also add a bias term, which shifts the threshold at which the neuron activates. Then we have something called the activation function, which is going to make the neuron fire, or activate, if the incoming inputs and weights reach a certain threshold.
The inputs get multiplied by their weights, added together and passed into the activation function. For now, our activation function is going to be really simple: if the sum of the inputs times the weights is positive, the neuron will activate, and if it’s negative, it will not.
In this case, the neuron receives a positive value of 2.7, so it will activate; the activation returns a one, not a zero. This activation function is called a threshold function, and it looks like this. The problem with this activation function is that you only need to be a tiny amount either side of the point in the middle to yield completely different results.
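The neuron described above can be sketched in a few lines of plain JavaScript. The two inputs (0.3 and 0.7) are from the example; the weights are hypothetical values chosen so that the weighted sum comes out at roughly 2.7:

```javascript
// A single artificial neuron with a threshold activation.
function thresholdActivation(sum) {
  return sum > 0 ? 1 : 0; // fires only if the weighted sum is positive
}

function neuron(inputs, weights, bias) {
  // Multiply each input by its weight and add everything together,
  // starting from the bias term.
  const sum = inputs.reduce((acc, x, i) => acc + x * weights[i], bias);
  return thresholdActivation(sum);
}

// Weighted sum: 0.3 * 3 + 0.7 * 2.58 ≈ 2.7, which is positive,
// so the neuron activates and returns 1.
console.log(neuron([0.3, 0.7], [3, 2.58], 0)); // 1
```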
There are many activation functions; other common ones are Sigmoid and ReLU, both of which allow smaller changes to be taken into account when deciding whether a neuron activates or not. If you connect many neurons together in rows and layers, you make a neural network.
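For reference, Sigmoid and ReLU are both simple one-line functions. Unlike the hard threshold, Sigmoid responds smoothly and ReLU responds proportionally to the size of the weighted sum, not just its sign:

```javascript
// Sigmoid squashes any value into the range (0, 1).
const sigmoid = x => 1 / (1 + Math.exp(-x));

// ReLU passes positive values through unchanged and zeroes out negatives.
const relu = x => Math.max(0, x);

console.log(sigmoid(0)); // 0.5 — right on the threshold
console.log(relu(-2));   // 0
console.log(relu(2.7));  // 2.7
```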
Here we have four layers. The purple layer is our input layer which represents the real data, so, if we were predicting house prices, this might represent the number of rooms, the age of the house, the square footage, and so on.
The red layer on the right is your output layer, in this case, we only have one output which is the price of the house. So we also have two hidden layers, the blue and the green layers, which receive inputs and pass outputs. These two hidden layers are dense layers which means they’re fully connected.
Again, we would assign weights to each input: the data flows from the input layer, hits the neurons in the next layer, and, if a neuron activates based on the weights, it sends its output forward to the connected neurons in the next layer. When training a neural network (and you need to train a neural network for it to be useful), in most cases you already know the expected result, which is known as supervised learning.
You can then use something called a loss function to calculate how far off the model’s output is from the correct output. In this case, the model predicted three but we were expecting eight. Then we would use something called an optimizer algorithm which is the thing doing the actual training.
Its job is to go back and update the weights and biases in order to get the outputs closer to the expected results. So you would repeat this a bunch of times with many sets of inputs and expected outputs, and eventually you get a trained model which can receive inputs and give accurate outputs.
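The loss-then-optimise loop can be sketched in plain JavaScript. This is a deliberately tiny example, not the TensorFlow.js API: one input, one weight, mean squared error as the loss, and plain gradient descent as the optimizer. The numbers are picked so the untrained model outputs 3 while we expect 8, as in the example above:

```javascript
// Train a single weight w so that output = w * x matches the target.
const x = 1.5, target = 8; // one training example (illustrative values)
let w = 2;                 // initial weight → output w * x = 3

const learningRate = 0.1;
for (let step = 0; step < 100; step++) {
  const output = w * x;
  const loss = (output - target) ** 2;        // squared error (the loss function)
  const gradient = 2 * (output - target) * x; // d(loss)/d(w)
  w -= learningRate * gradient;               // nudge w to reduce the loss
}

console.log(w * x); // ≈ 8 after training
```

Repeating this update over many steps (and, in a real network, over many weights and many examples) is exactly what the optimizer does.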
So why all the hype with neural networks? Well, they’re actually quite general: you can learn the basics and then apply the same techniques and tools to a range of different problems, with only a little tweaking and not much domain-specific knowledge required.
Neural networks work best with labelled data and more and more labelled data is becoming available every day. This means that developers, researchers and businesses can make use of the ever-increasing computing power and available data, and use neural networks to find patterns which they never would have been able to by just searching and analysing the data themselves.
So how do we actually carry out machine learning? I previously demonstrated how an artificial neural network works but what about the technology we, as developers, can use to implement this?
Before I talk more about TensorFlow.js, let’s look at what TensorFlow is. So TensorFlow is an incredibly powerful machine learning and deep learning library. It allows data flow programming across a range of tasks, and, as well as being a symbolic maths library, it’s also used for machine learning applications such as neural networks.
TensorFlow was developed by Google Brain for internal use and was open sourced in November 2015. Since then, it has quickly become one of the most popular machine learning frameworks on the scene. TensorFlow’s GitHub repo has more than three times the number of stars of scikit-learn, which is the next most starred machine learning project.
TensorFlow allows developers to create large-scale neural networks with many layers, and can process and create models for things such as voice and sound recognition, language detection and a whole lot more. NASA used TensorFlow to find new planets orbiting stars, and more recently, students have been using TensorFlow to map craters, to try to figure out where matter has existed across different places and times, and so understand the very origins of our solar system.
It’s even being used to prevent illegal deforestation in the Amazon. Solar-powered upcycled phones are hidden high up in the trees and they’re trained using TensorFlow to detect the sounds of chainsaws and logging trucks and alert the rangers who police the forest.
So why TensorFlow.js? Well, firstly, because of the popularity of the TensorFlow Playground. This is an in-browser interactive visualisation of a neural network, where you can see how adding layers and neurons works and how changing these yields different results. For example, here we’re trying to predict how to separate the yellow and the blue dots; at first it doesn’t do great, but if you add more neurons and layers then the model will have no problem in finding that separation.
So the code used to make this website is all open source, there’s a repo on the TensorFlow GitHub page, and this was turned into a library called deeplearn.js, which has now become TensorFlow.js, so TensorFlow.js is just the next iteration of that work. deeplearn.js was only released in August 2017, and in the short time before joining the TensorFlow family, they released a bunch of demos, such as Style Transfer, where you can apply the style of famous artists’ work to any photo.
So this here is applying the style of Francis Picabia to a photo of Scarlett Johansson. deeplearn.js also created Teachable Machine, where you can train a neural network using your computer’s camera, with things such as a hand or head movement triggering the loading of set sounds or images.
So you capture frames while holding down one of the coloured train buttons, while doing the action, and then after the training, the neural network will trigger one of the outputs on the right when you repeat this action. So this demo’s really cool, actually, and it’s on the TensorFlow.js website if you want to have a play.
So TensorFlow.js is able to train models in the browser, and it uses WebGL, which is an API used by browsers to access your graphics card, or GPU. Running in the browser means no setting up drivers or installs, and it works across all devices.
So what can you do with TensorFlow.js?
There are lots of demos, and a lot of them let you interact by drawing, uploading an image, or accessing your camera for image detection, or even for capturing your movements as a controller. And there are also games, such as this one.
This is my friend Asim, who was very keen to be in this demo because he had just bought a new hat, of which he is very proud. Asim is the other co-creator of aijs.rocks, and here he’s capturing images using his webcam, ready to play the Pac-Man game.
The app loads a model which has already been trained, but it requires the user to do this additional training so the model can update itself. Here you can see the model capturing extra frames for up, down, left and right; it then gets trained with these right inside the browser, and you can play the game using movements as controllers.
So another fun demo I want to show you is Move Mirror, made by the smart people at the Google Creative Lab, which lets you explore pictures in a fun way. You turn on your webcam and move around, and the camera pulls up pictures of poses that match yours in real time from a database of more than 80,000 images. And of course, here’s Asim again trying it out, in his hat. So this game uses a pretrained model called PoseNet which detects positions of 17 points in the body such as eyes, ears, wrists and knees.
It runs entirely within the browser and it can be used with any webcam. So I also want to show you Sketch RNN which is a generative recurrent neural network capable of producing sketches of common objects. So as you can see here, I can start drawing by adding this one line, and then the model will continue my drawing.
I’ve selected a bird here but there’s a number of different subjects to choose from. So this was made by David Ha with the goal of training a machine to draw and generalise abstract concepts in a similar way to humans. So it was trained on a data set of hand-drawn sketches, and it has the potential to help artists with their work or to help people learn to draw.
You can use npm or yarn, or you can link to the CDN in a script tag right inside your HTML file; then you have access to the global tf object, where you can carry out various methods and operations. For training models, the library consists of two different packages: the Core API and the Layers API.
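For example, the script-tag route might look like this (the jsdelivr URL below is the conventional one from the TensorFlow.js getting-started docs; check it against the current docs before using it):

```html
<!-- Load TensorFlow.js from a CDN; this exposes a global `tf` object -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script>
  // The global tf object is now available for tensor operations
  const t = tf.tensor([1, 2, 3, 4], [2, 2]);
  t.print();
</script>
```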
A tensor is the central unit of data in TensorFlow.js, and it’s basically a multi-dimensional array of numbers, an n-dimensional array. You can create a tensor by calling tf.tensor and passing it an array of values and a shape of rows and columns, or you can let it infer the shape from nested arrays.
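To make “shape” concrete, here is a tiny plain-JavaScript helper (not part of the TensorFlow.js API, just an illustration) that infers the shape of a nested array the same way shape inference works conceptually — walk down the first element of each level, recording the length at each depth:

```javascript
// Infer the shape of a nested array.
function inferShape(arr) {
  const shape = [];
  let current = arr;
  while (Array.isArray(current)) {
    shape.push(current.length);
    current = current[0];
  }
  return shape;
}

console.log(inferShape([1, 2, 3, 4]));     // [4]    — a flat array of four values
console.log(inferShape([[1, 2], [3, 4]])); // [2, 2] — two rows, two columns
```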
So this is an example of doing polynomial regression, which is similar to linear regression but fits a curve. Here we are predicting what the y value will be, knowing the value of x, so that we can plot it on a graph. This is not deep learning; regression is generally referred to as shallow learning.
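As a plain-JavaScript sketch of the same idea (not the Core API itself): fit a curve y = ax² + bx + c to some data points using mean squared error and gradient descent. The target curve and its coefficients below are made up purely for illustration:

```javascript
// Sample five (x, y) points from a known quadratic curve.
const trueFn = x => 0.5 * x * x - x + 2; // hypothetical target curve
const data = [-2, -1, 0, 1, 2].map(x => [x, trueFn(x)]);

// Start with all coefficients at zero and learn them.
let a = 0, b = 0, c = 0;
const lr = 0.01;
for (let step = 0; step < 20000; step++) {
  let da = 0, db = 0, dc = 0;
  for (const [x, y] of data) {
    const err = a * x * x + b * x + c - y; // prediction minus target
    da += 2 * err * x * x;                 // partial derivatives of the
    db += 2 * err * x;                     // squared error w.r.t. a, b, c
    dc += 2 * err;
  }
  const n = data.length;
  a -= lr * da / n;
  b -= lr * db / n;
  c -= lr * dc / n;
}

console.log(a.toFixed(2), b.toFixed(2), c.toFixed(2)); // ≈ 0.50 -1.00 2.00
```

Given a new x, the learned coefficients now predict a y value we can plot.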
Setting up layers for a neural network with these maths operations is a lot more complicated, so let’s take a look at the Layers API. So this is a higher level, Keras-inspired Layers API, and it makes it a whole lot easier to build and train models.
You can create a neural network with pre-constructed layers, such as in this example. Here we create a dense layer, which means it’s fully connected, and we pass a config object to it with some data, such as the number of units for the dimensionality of the output space and the activation function; in this case we’re going to use ReLU, if you remember what that looks like from our neural network intro.
The activation function is going to tell each neuron in this layer whether to activate or not, based on the inputs and the inputs’ weights. We also need to record how far off the expected output we are, using a loss function; mean squared error is a common and straightforward loss function, so we’ll use that.
Then we need an optimizer, which is going to go back and adjust the weights so that, hopefully, as the model iterates over and over again, it gets a lower loss and the outputs get closer to the expected ones. You would call model.compile with your loss function and your optimizer, and model.fit with your data: x as your data, y as the corresponding labels.
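Put together, the flow looks something like this. Treat it as a sketch of the Layers API rather than copy-paste production code (the layer sizes and training data here are made up, and it assumes the tfjs library is already loaded):

```javascript
// Build a model from pre-constructed layers.
const model = tf.sequential();

// A fully connected (dense) hidden layer: 10 units, ReLU activation.
// inputShape is the number of features in each training example.
model.add(tf.layers.dense({units: 10, activation: 'relu', inputShape: [3]}));
// Output layer: a single unit for one predicted value.
model.add(tf.layers.dense({units: 1}));

// Loss: mean squared error. Optimizer: stochastic gradient descent.
model.compile({loss: 'meanSquaredError', optimizer: 'sgd'});

// x: the training data, y: the corresponding labels (illustrative values).
const x = tf.tensor2d([[1, 2, 3], [4, 5, 6]]);
const y = tf.tensor2d([[10], [20]]);

// fit returns a promise, since training runs asynchronously in the browser.
model.fit(x, y, {epochs: 50}).then(() => {
  model.predict(tf.tensor2d([[7, 8, 9]])).print();
});
```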
Tensors hold on to memory (in the browser they’re backed by WebGL textures), so you need to clean up after them. Simply wrap your functions with tf.tidy, or you can use dispose, which is similar but is called directly on the tensors or variables, and then your browser won’t crash because of a memory leak. So I’ve talked about how to train a model with TensorFlow.js; now let’s see how to load a pretrained model straight into the browser.
With a pretrained model, we simply call tf.loadModel and pass a URL. This is a URL to a pretrained model called MobileNet; MobileNet is an open-source convolutional neural network architecture for image recognition. There are many types of neural network, but whenever you hear convolutional neural network, think image recognition.
MobileNet was trained using a large subset of a huge image database called ImageNet, and it’s optimised to perform image detection on mobile devices and embedded applications. There’s more info about MobileNet on the TensorFlow GitHub page under Examples, and it’s a really interesting one to have a play around with.
You don’t need any prior machine learning knowledge: you simply load the model and pass it an image, video or canvas element, and it will return an array of the most likely predictions along with their confidence percentages. So as well as training a model and loading a pretrained model, the third thing you can do with TensorFlow.js is transfer learning.
So this is where you train the tail end of an existing model. Using the MobileNet example once again, you would load the model and set it to a variable and then call getLayer on it and pass the name of the layer. Once you have this layer, you can create your own model, just like we did in the previous example, and train it with your own data.
If you remember, inference mode means running a model which has already been trained. The standard Python TensorFlow, when used with a powerful CUDA graphics card, takes less than three milliseconds, and just using the CPU of something like a MacBook Pro takes around 60 milliseconds. TensorFlow.js running with the same fancy graphics card comes in at just under 11 milliseconds, and running on the integrated graphics card of a laptop, it’s around 100 milliseconds. But these are milliseconds, so it’s important to realise that 100 milliseconds is not actually bad at all. It’s only going to improve as both the technology and the web improve, and you can already build some amazing applications with this.
There are, of course, a bunch of interactive demos you could have a play with on aijs.rocks, and I also want to mention ml5.js, which is a wrapper around TensorFlow.js which makes it even easier to load and access pretrained models in your browser.
So what does the future hold? Well, it’s certainly bright. The examples I’ve shown are mostly fun or creative rather than solving real-world or enterprise problems, but I think we’ll see more examples of solving those problems over the coming months. There’s a lot of potential to build accessible apps using head movements or voice as controllers, for example, and also to build privacy-friendly apps: running models in the browser means that your users’ data doesn’t have to be sent server side at all.