Introduction to Neural Networks – I got many votes in the poll that I put up two days ago for deep learning and neural networks, so I decided to start with them. Welcome to the first lecture on deep learning and neural networks. This is a very general introduction to what a neural network is, what deep learning is, and why we need them in today's world. I have written a few articles before in my machine learning category.
You can find a video there called artificial neural networks. Consider it the prerequisite for this course on deep learning. I'll put the link in the description below, or in the cards above, so you can check that video out. In it, I discussed a few different concepts: what a neural network looks like and the different mathematical functions involved.
There is also an implementation that we will be doing in TensorFlow. That will not happen right away but after a few months, since I am also doing the practical hands-on myself and that takes time. For the time being, we'll have a few articles on different concepts like activation functions, the delta rule, the chain rule, differentiation, partial derivatives and so on, and later we will do some practical hands-on with TensorFlow or others like PyTorch, whichever is comfortable.
So let's begin. Today's article is about deep learning and neural networks; you can call it whichever you like. I'll give you a gentle idea of what these things basically are. This topic is not something that has only just been developed.
It has been around for a long time: these ideas were already there in the 1940s and 50s. But at that time we did not have the hardware that we have now. Today we have high-performance computing machines and hardware in different configurations: solid-state drives, DDR3 and DDR4 memory, and systems like gaming PCs and laptops.
In the old days we did not have that much computing power to carry out these activities. If you remember, we had punch card systems, where we used punch cards to enter instructions and carry them out one after another. Today, the equivalent of that punch card is writing a program in R or Python.
So you can think of it in a similar way. We have evolved over many years, and now we have the hardware and systems to do these things more efficiently. The very first notion of a neuron or neural network came from the biological neuron, from the human brain.
The idea of building this kind of system came from the biological neural network. In the human brain you have axons, dendrites, synapses, etc. I will not go into the details of the biology of how the human brain works; that is another topic. Just the idea of the neural network came from it. The very first artificial neuron was put forth by McCulloch and Pitts and is known as the MCP neuron. They proposed it to check whether such neurons can implement simple Boolean gates.
So we have AND, OR and other gates like that. What we basically have is a unit. We don't call it a neuron; we just call it the terminal unit. We give it two inputs, X1 and X2, and the unit consists of two parts: first a linear part, and then a nonlinear part. We will come to why we need these two parts, and why the nonlinear part is the most important in neural networks, down the line.
The linear part is essentially a summation: you take all the inputs, multiply them by some coefficients, and add them up. After that comes a step function; for the time being we'll just consider a step function. Finally we have the output, y-cap. The coefficients are called weights. When you put the inputs into this system, it multiplies each one by its weight, so you get W1 X1 plus W2 X2. That is how your input flows through.
So it is the sum of the products of the weights and the inputs going into the system, which is then passed through the step function to produce the output. So y-cap is the step function applied to W1 X1 plus W2 X2 plus one more term. This equation is not new to you: we have already seen linear regression and its components. It is essentially Y is equal to MX plus C. C, in this case, is the bias. It is an extra term that we apply ourselves; it does not come from the inputs.
Just for your understanding, call it b0 or B. You can also think of the bias as having a weight of its own, say W3. Your X remains the same, but your M is replaced by the W's; these are your weights, or you can call them coefficients. In linear regression they are called slopes, but in a neural network they are called weights. With this you get an output, which is called y predicted.
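To make the two parts of the unit concrete, here is a minimal Python sketch of one unit with a linear part (weighted sum plus bias) and a step function. The function names, the threshold of 0, and the numeric values are my own assumptions for illustration; they are not taken from the original lecture.

```python
def step(z, threshold=0.0):
    """Nonlinear part: output 1 when the weighted sum reaches the threshold."""
    return 1 if z >= threshold else 0


def unit_output(inputs, weights, bias):
    """Linear part (sum of weight * input, plus bias) followed by the step function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(z)


# Hypothetical values chosen only for illustration.
x = [1.0, 0.0]   # X1, X2
w = [0.6, 0.4]   # W1, W2
b = -0.5         # bias, the "C" of Y = MX + C

y_cap = unit_output(x, w, b)
print(y_cap)
```

Changing only the bias moves the point at which the unit starts to fire, which is the role the intercept plays in linear regression.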
Then you have the actual output y. You take y minus y predicted, square it, and sum over all examples: that is the sum of squared errors. It gives your error, which is also written as J, the cost. J is simply the conventional symbol for the cost function. In linear regression we saw that J is a function of M and C; here, likewise, it is a function of the weights and the bias. So now we have obtained an output and we compare it with the output we already have. If there is a big difference between them, we go back to this neuron.
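The cost J described here, the sum of squared differences between the actual and the predicted outputs, can be sketched in a few lines of Python. The function name and the sample values are assumptions made up for this example.

```python
def sum_of_squared_errors(y_actual, y_predicted):
    """Cost J: sum over all examples of (y - y_predicted) squared."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_actual, y_predicted))


# Hypothetical targets and predictions, only for illustration.
y = [1, 0, 0, 1]
y_pred = [1, 1, 0, 0]

J = sum_of_squared_errors(y, y_pred)
print(J)  # two of the four predictions are wrong, so J is 2
```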
We adjust the weights. We do not touch the input; it stays in the original form that we wanted to give to the neural network. What we change are the weights. So whenever we construct a neuron or a neural network, we are learning weights. If you remember, when we did gradient descent we had a parameter that I called lambda.
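The idea of leaving the inputs alone and repeatedly nudging the weights can be sketched as follows. This is a minimal sketch under my own assumptions: a single linear unit without the step function, a squared-error cost, a delta-rule style update, and made-up data that follows y = 2x + 1. The name `learning_rate` stands for the parameter called lambda in the text.

```python
def train_linear_unit(xs, ys, learning_rate=0.05, epochs=500):
    """Learn one weight and one bias by repeatedly reducing the prediction error."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            y_pred = w * x + b          # forward pass; the input x is never modified
            error = y - y_pred          # how far off the prediction is
            w += learning_rate * error * x   # adjust the weight
            b += learning_rate * error       # adjust the bias
    return w, b


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = train_linear_unit(xs, ys)
print(round(w, 3), round(b, 3))
```

With a small enough learning rate, the learned weight and bias should approach 2 and 1 for this data; too large a value can make the updates overshoot instead of settling.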
That lambda is nothing but the learning rate. We want to learn the weights so that the error produced at the end is minimal, or converges to 0. It cannot always be exactly 0, but it can become approximately 0, and that is the end result we want to achieve.
After the MCP neuron came the perceptron, which is the basic neuron we build on. Of course, in the real world today we have networks like Google's Inception and Microsoft's ResNet, the residual neural network. These are two of the best-known neural networks in use today, but all of them have their origins in the perceptron. In the perceptron you add a bias, which acts like your C, the intercept. This is the key point when you construct your neural network: B is nothing but the intercept, which lets you shift the separating line so that linearly separable instances can actually be separated.
Now, since the original goal was to implement Boolean gates, they started with the AND gate. We have X1, X2 and Y. For 1 and 1 you get 1; for 0 and 1 you get 0; for 1 and 0 you get 0; and for 0 and 0 you get 0. To mimic this Boolean behaviour with a neural network, we draw one neuron or unit with inputs X1 and X2. Here I am not considering any weights, so the linear part is just the sum of the inputs, which can be 0, 1 or 2 depending on the combination of 1s and 0s.
So what is the nonlinear part here? It is the condition greater than 1: if the sum is greater than 1, the unit fires. Then you have the OR gate. The scenario is similar, but here the output is 1 whenever at least one input is 1. For 1 and 1 it is 1, for 0 and 1 it is 1, for 1 and 0 it is 1, and for 0 and 0 it is 0. So the unit changes only in its condition, which becomes greater than or equal to 1; you still have the two inputs X1 and X2 and the output y-cap. This part could be implemented perfectly.
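The two gates described above can be written directly as unweighted threshold units. This is a small Python sketch; the function names are my own, while the firing conditions (sum greater than 1 for AND, sum greater than or equal to 1 for OR) are the ones given in the text.

```python
def and_gate(x1, x2):
    """Unweighted unit: fires only when the sum of the inputs is greater than 1."""
    return 1 if (x1 + x2) > 1 else 0


def or_gate(x1, x2):
    """Same unit, but it fires when the sum is greater than or equal to 1."""
    return 1 if (x1 + x2) >= 1 else 0


# Print the truth table for both gates.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, "AND:", and_gate(x1, x2), "OR:", or_gate(x1, x2))
```

The only difference between the two gates is the condition in the nonlinear part; the linear part is the same sum in both cases.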
But the problem with this perceptron is that it cannot implement the XOR gate. How does the XOR gate look? If both inputs are 1 the output is 0, and if both are 0 the output is 0; otherwise it is 1. In other words, if both inputs are the same you output 0, and if they differ you output 1. A single perceptron could not mimic this behaviour.
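One way to get a feel for this limitation is to search for weights and a bias that make a single threshold unit reproduce each gate. The sketch below is an illustration, not a proof: it only tries a coarse grid of values that I picked arbitrarily, and the helper names are hypothetical. It finds settings for AND and OR but none for XOR, which is consistent with the claim above.

```python
from itertools import product

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def unit(x1, x2, w1, w2, b):
    """Single threshold unit: weighted sum plus bias, then a step at 0."""
    return 1 if w1 * x1 + w2 * x2 + b >= 0 else 0


def find_weights(targets, grid):
    """Return the first (w1, w2, b) on the grid that reproduces the targets, else None."""
    for w1, w2, b in product(grid, repeat=3):
        if all(unit(x1, x2, w1, w2, b) == t for (x1, x2), t in zip(INPUTS, targets)):
            return (w1, w2, b)
    return None


# Candidate values from -4.0 to 4.0 in steps of 0.5 (an arbitrary choice).
grid = [i / 2 for i in range(-8, 9)]

print("AND:", find_weights([0, 0, 0, 1], grid))
print("OR: ", find_weights([0, 1, 1, 1], grid))
print("XOR:", find_weights([0, 1, 1, 0], grid))  # no setting on the grid works
```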
This was the major challenge in the older days. To understand it geometrically, say you have a plane with two classes of instances: for positive I'm using a triangle and for zero I'm using a circle. What happens is that you cannot separate the two classes with a straight line; there is no hyperplane that separates these instances. If the zeros were all on one side and the positives on the other, a line would work perfectly.
So you cannot separate these by drawing a straight line. What they essentially did is project the points to a higher dimension and construct a hyperplane there, which can take different orientations in that space. This is where the bias comes into the picture: with its help you can adjust the hyperplane. In this way the instances become linearly separable: what was not possible in two dimensions they could achieve in three dimensions.
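Here is a small Python sketch of that idea applied to XOR. The text does not say which projection was used, so the third coordinate X1 times X2 is purely my assumption, as are the plane's weights and bias. With that extra coordinate, a single threshold unit in three dimensions reproduces XOR.

```python
def lift(x1, x2):
    """Project a 2-D point into 3-D by adding the product x1 * x2 (an assumed choice)."""
    return (x1, x2, x1 * x2)


def plane_unit(point, weights, bias):
    """Threshold unit in 3-D: which side of the plane w.x + b = 0 is the point on?"""
    z = sum(w * x for w, x in zip(weights, point)) + bias
    return 1 if z >= 0 else 0


# Hypothetical plane, chosen by hand for illustration.
weights = (1, 1, -2)
bias = -0.5

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), "->", plane_unit(lift(x1, x2), weights, bias))
```

The point (1, 1) is the only one whose third coordinate is 1, so the negative weight on that coordinate pushes it back to the same side of the plane as (0, 0).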
These neural networks, or artificial neural networks, are also called universal function approximators. Whenever you have a number of instances that are not separable in a lower dimension, they are projected to a higher dimension. The power of a neural network to project points from a lower-dimensional space to a higher-dimensional one is backed by two theorems in mathematics: one is Cover's theorem and the other is Kolmogorov's theorem. Because of these two results, neural networks are called universal function approximators. You do not see this kind of approach in the other algorithms we have seen so far in machine learning, like decision trees or singular value decomposition.
Neural networks are also called black box machines. Assume I have this setup.
I know this is a very bad example to show, but say these two are inputs going into this neuron. Consider it a unit, or a neural unit, and this is the output. Whenever you pass inputs through it, you don't know what's happening inside, so it is a black box. You only know that these two inputs go into the system and that you get some output in response. Naive Bayes or a decision tree, on the other hand, are white box machine learning algorithms.
There you know how the machine takes a decision in producing the output, but in the case of neural networks you cannot say for sure what is happening inside. So these kinds of neural networks are hard to deploy in production for things like predicting stocks, giving a loan to a particular customer, or providing drugs to a particular patient.
It becomes a very questionable scenario: on what basis did you provide that drug or predict that particular thing? It becomes debatable. So often, in such cases, you don't go with neural networks; instead you go with other machine learning algorithms like SVM, naive Bayes or a decision tree to perform that activity. Well, that was a gentle introduction to deep learning and neural networks.
There are many more topics coming. I have not talked about the nonlinear part; that will be covered under activation functions, of which there are many. Then you have gradient descent, the delta rule, the chain rule, etc., which will be coming in the forthcoming articles. That is all for this introduction.