# Decision Boundary of Neural Nets

Decision Boundary of Neural Nets – The topic for today’s article is the decision boundary of a neural network. In the last article we saw the matrix representations and many other forms that we use in deep learning. In today’s article, let’s see what a decision boundary looks like in a neural network. We have learnt that different algorithms, including decision trees and linear regression, each produce a specific kind of decision boundary for a given dataset, and that it depends on the distribution of the input.

In a neural network, there is no single fixed kind of decision boundary that applies to every instance; instead, the network learns the decision boundary from the data as it comes. For example, say I have a pair of axes with some instances of one class on one side and some more instances of another class elsewhere. This is a two-class problem, and what you can see is the distribution of the data.

That distribution is homoscedastic, meaning it is uniform across the two separations. Now, if I ask you whether we can draw a decision boundary that separates these two classes well, I can essentially draw one of the form y = x, where y − x < 0 on one side and y − x > 0 on the other side.

Essentially, in two dimensions you can see this boundary as a line, but in three dimensions you can think of it as a plane separating the two classes: whatever instance lies on the upper side belongs to the cross, or black, class, and whichever falls below the line belongs to the blue class. Now say we have another setup. In the real world, you don’t have datasets that are all in a linear formation; there can be non-linearity too, and that’s why we introduce the nonlinear part. We’ll come to that a bit later.
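As a minimal sketch of this sign test (the points below are made up purely for illustration), the line y = x classifies a point by the sign of y − x:

```python
import numpy as np

# Hypothetical 2-D points; the boundary is the line y = x.
points = np.array([[1.0, 3.0],   # y - x > 0: above the line
                   [2.0, 0.5],   # y - x < 0: below the line
                   [0.0, 2.0]])

# Sign of (y - x) decides the class: positive above the line, negative below.
labels = np.where(points[:, 1] - points[:, 0] > 0, "black", "blue")
print(labels)  # ['black' 'blue' 'black']
```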

So now say our distribution is different: the points of one class lie as before, and we have some more points belonging to the other, blue, class. The first class is still somewhat homoscedastic, meaning uniform, but the lower, blue class is not uniform; it follows some particular curvature. Essentially, we cannot put down a straight line that separates these two classes.

So essentially what we can have instead is a decision boundary that is not linear: a parabolic curve. We are mainly doing curve fitting of the form y = x², and in three dimensions this surface looks like a bowl shape. Where the cost function of linear regression involves a slope and an intercept, a neural network has weights and biases instead.
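The same sign test carries over to the parabolic boundary: classify by the sign of y − x². A small sketch with made-up points:

```python
import numpy as np

# Hypothetical points; the boundary is the parabola y = x**2.
points = np.array([[0.0, 2.0],    # y - x**2 > 0: above the curve
                   [2.0, 1.0],    # y - x**2 < 0: below the curve
                   [-1.0, 5.0]])

# Sign of (y - x**2) separates the two classes.
above = points[:, 1] - points[:, 0] ** 2 > 0
print(above)  # [ True False  True]
```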

So the neural network will also learn this kind of decision boundary. And then you have yet another setup: say you have some points inside a region and some other points, say of a triangle class, outside it. With this data distribution you can neither draw a straight line that separates the two classes nor fit a parabolic curve. What you can essentially do is draw a circle, which is of the form x² + y² = a² for some constant a; for the class that lies inside, x² + y² − a² < 0, and for the one outside it is greater than 0.

That is, for points lying exactly on the circle you have the equation of the circle itself; for points that lie inside the radius the expression x² + y² − a² is less than 0; and for points outside it is greater than zero. So now you can see that different types of decision boundaries exist across different algorithms, and you need to make your neural network learn them.
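And likewise for the circular boundary, the sign of x² + y² − a² tells you inside versus outside. A sketch with an assumed radius a = 2 and made-up points:

```python
import numpy as np

a = 2.0  # radius of the circular boundary x**2 + y**2 = a**2

points = np.array([[0.5, 0.5],   # inside the circle
                   [3.0, 0.0],   # outside the circle
                   [1.0, 1.0]])  # inside the circle

# x**2 + y**2 - a**2 < 0 inside the circle, > 0 outside, = 0 on it.
inside = (points ** 2).sum(axis=1) - a ** 2 < 0
print(inside)  # [ True False  True]
```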

These are all decision boundaries. And if you remember our equation, we have the linear part, where we compute w1·x1 + w2·x2 + … + wn·xn in this particular unit. Then we have some nonlinear function; mainly we would use the sigmoid function, so that it transforms the linear output into a useful form.
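A single such unit can be sketched in a few lines; the weights, bias, and inputs below are made-up values, not anything from a trained network:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy unit with made-up weights and bias.
w = np.array([0.4, -0.2, 0.1])   # w1 .. wn
b = 0.5                          # bias
x = np.array([1.0, 2.0, 3.0])    # inputs x1 .. xn

z = w @ x + b    # linear part: w1*x1 + w2*x2 + ... + wn*xn + b
y = sigmoid(z)   # nonlinear part squashes z into (0, 1)
print(f"z = {z:.2f}, sigmoid(z) = {y:.3f}")  # z = 0.80, sigmoid(z) = 0.690
```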

Now, say your network is only learning this particular linear function. Then it classifies your instances with a straight line each and every time, and if a curved distribution comes along, it is unable to classify it. Why? Because it is only learning the linear part. That’s why you need a nonlinear function. So in our unit we took the inputs x1, x2, and so on, and we also had the bias, which is there for adjustment.

And then we compute the output, that is, the estimate. Having both a linear function and a nonlinear function is indispensable in a neural network, so you can’t just take the nonlinear part away from the unit. It is like the human brain: if you have only the right hemisphere and not the left, it becomes difficult for you to survive. Similar is the case with the neuron: it won’t get by on the linear part alone; either it will collapse after a few computations, or it will not do any meaningful computation at all.

So I’ll just give you one more example with some simple mathematical equations. Say w is represented in terms of x as w = a·x, then x in terms of y as x = b·y, and y in terms of z as y = c·z. Now I’m asking you to represent w in terms of z. How would you do that? First, in the equation for w, you replace x by b·y, and then in place of y you substitute c·z.

Assume all of a, b, c are constants; say they are 3, 2 and 1, whose product is 6, so w is nothing but six times z. This is still a linear function. Now, in the real world, our equation may be of the form z = w1·x + w2·x³ + w3·x⁴ + … : not a monomial but a polynomial. If your neural network is not able to learn these kinds of representations, it becomes difficult to train it for each and every scenario; you would then have to change the architecture we have seen, which essentially affects the generalization of your neural network.
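The substitution above is easy to check in code; the constants 3, 2 and 1 are the ones from the example:

```python
# Constants from the example: w = a*x, x = b*y, y = c*z.
a, b, c = 3, 2, 1

def w_of_z(z):
    # Substituting step by step: w = a*x = a*(b*y) = a*b*(c*z)
    return a * b * c * z

print(w_of_z(1))   # 6  -> w is just six times z, still a linear function
print(w_of_z(10))  # 60
```

Composing linear functions always gives back a linear function, which is exactly why the polynomial case needs something more.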

So learning these kinds of functions becomes essential. Before the era of deep learning, if such terms appeared in your equation, they would be treated using something called feature engineering: the features that were not linear in the input were passed through some function to make them linear, and only then put into the model. But in current deep learning techniques or strategies, we don’t do this manual feature engineering.

Feature engineering is not a part of current deep learning trends; the network does the feature learning itself. All the inputs are combined linearly and then passed through some nonlinear function, using the sigmoid or other functions like ReLU. So what we mainly have is an input layer, an output layer, and some number of hidden layers in between, from one up to n.

It can be any number, depending on the complexity. Now, if your neural network is only taking the inputs and multiplying them with the weight vector, what is the purpose of your neural network? It is not really learning anything; it is just multiplying and adding up the weights, which you could do with simple math as well. But in order to learn the non-linearity present in a particular dataset, how it is distributed and how it changes over a particular period, you need this non-linearity part too. It becomes an indispensable part of your neural network, so that it can learn each and every decision boundary that is offered by other machine learning algorithms.
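The point that stacking layers without a nonlinearity buys you nothing can be checked directly: applying two weight matrices one after another is just one bigger linear map. A sketch with random made-up weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" with no activation in between (weights are made up).
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))
x = rng.normal(size=3)

# Applying the layers one after another ...
h = W1 @ x
out_stacked = W2 @ h

# ... is identical to a single linear map with weights W2 @ W1.
out_single = (W2 @ W1) @ x
print(np.allclose(out_stacked, out_single))  # True
```

This is why the nonlinear activation between layers is what actually lets a deep network represent curved decision boundaries.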

That is, whether it comes from an SVM, from a parabolic curve fit, from linear regression, from decision trees, or from many other algorithms. Well, that was all regarding the decision boundaries of neural networks.