# Weight Matrix

The topic for today's article is representing the weights at the different layers of a neural network. In the last article, we saw the concept of deep learning, the structure of a neuron, and where a neuron fits in the different parts of a neural network. Representing the weights at each and every layer becomes crucial whenever you do matrix multiplication or any other processing in the neural network.

So you need to know how this actually works, and that is the purpose of this article. So let's begin. Say we have four different inputs: $x_1$, $x_2$, $x_3$ and $x_4$. Then we have three neurons, then two more neurons after them, and finally one neuron that consolidates all the processing and gives the final output. Say this output is represented by the estimate $\hat{y}$ (y hat).

The part that I have drawn inside a dotted rectangle is my input layer, which I represent as IL. The last layer is my output layer, OL. Whatever is left apart from the input layer and the output layer becomes the hidden layers, and I can clearly see that I have two of them. I represent a hidden layer as HL, and since I have two different hidden layers, I need to be very precise whenever I do the computation. That is why I write the first hidden layer as $HL^{[1]}$, and then there is the second hidden layer.

It is HL with a superscript 2 in square brackets, $HL^{[2]}$. The square bracket is just the notation that says this index refers to a layer. Now you need to connect the layers. Say this particular neuron, or unit, gets input from each of the four inputs. We also know that inside each neuron there are two sections. The first computes a linear summation, and the same holds for every other neuron. The second applies some function.

That function is non-linear. For the time being, we are not considering any biases in this scenario. We just want to see what the weights at the different layers look like and how they are used in the computation. We know the linear part of the first neuron gets input from all four inputs $x_1$, $x_2$, $x_3$ and $x_4$. Let's represent its result by $z_1$; that is the first output. It is essentially a linear combination: the sum of the products of each input with its weight. So you have $x_1$ plus $x_2$ plus $x_3$ plus $x_4$, and each of these has a weight attached as well.

Now, what are those weights? Take the weights of the first neuron: $w_{11}$ for the first input, then $w_{12}$, $w_{13}$ and $w_{14}$. The first index says which neuron and the second says which input, so you are now precise about which weight goes to which neuron:

$$z_1 = w_{11}x_1 + w_{12}x_2 + w_{13}x_3 + w_{14}x_4$$

For the next neuron the weights will be different; it won't reuse the same ones. It again has its own weights, and it too gets input from all four inputs.

So at this second node, or unit, the equation gives $z_2$. We are talking about the first layer, so we must not forget to write the superscript $[1]$, or else we will get confused later. This neuron also has the four inputs, each with a weight. These are now the weights of the second neuron, which is why they are $w_{21}$ for the first input, then $w_{22}$, $w_{23}$ and $w_{24}$:

$$z_2^{[1]} = w_{21}x_1 + w_{22}x_2 + w_{23}x_3 + w_{24}x_4$$

Similarly, for $z_3^{[1]}$ you have $x_1$, $x_2$, $x_3$ and $x_4$ with the weights $w_{31}$, $w_{32}$, $w_{33}$ and $w_{34}$:

$$z_3^{[1]} = w_{31}x_1 + w_{32}x_2 + w_{33}x_3 + w_{34}x_4$$
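To make the per-neuron sums concrete, here is a minimal sketch in plain Python. The input values and the weights are made-up numbers chosen only for illustration, and the helper name `weighted_sum` is hypothetical, not something from the article.

```python
# Hypothetical inputs x1..x4 (illustrative values only).
x = [1.0, 2.0, 3.0, 4.0]

# Hypothetical first-layer weights: row i holds the weights of neuron i+1,
# so weights[0] is (w11, w12, w13, w14), weights[1] is (w21, ..., w24), etc.
weights = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
]

def weighted_sum(w_row, inputs):
    """Linear part of one neuron: w_i1*x1 + w_i2*x2 + ... (no bias)."""
    return sum(w * value for w, value in zip(w_row, inputs))

# One linear output per neuron of the first hidden layer: z1, z2, z3.
z_layer1 = [weighted_sum(row, x) for row in weights]
print(z_layer1)  # approximately [3.0, 7.0, 11.0]
```

Each row of `weights` belongs to one neuron, which is exactly the "first index is the neuron, second index is the input" convention used for the weights.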

So whatever we have done so far is only a linear transformation. Now it has to go through the second part, the non-linear function, and from there you get the output. Let's represent the output by $a$. The output of the first neuron at the first layer is $a_1^{[1]}$, and similarly you have $a_2^{[1]}$ and $a_3^{[1]}$ for the second and third neurons. To get these outputs, the linear result needs to be passed through some function.

So how do you do that? Take the linear part and apply the transformation: $\sigma(z_1^{[1]})$ gives you $a_1^{[1]}$. Similarly, $\sigma(z_2^{[1]})$ gives $a_2^{[1]}$ and $\sigma(z_3^{[1]})$ gives $a_3^{[1]}$. These are the outputs you get after the first round of processing at the hidden layer.
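As a small illustration of this step, the sketch below applies the sigmoid, $\sigma(z) = 1 / (1 + e^{-z})$, to three linear outputs. The values in `z_layer1` are hypothetical and picked so the results are easy to check.

```python
import math

def sigmoid(z):
    """Sigmoid non-linearity: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical linear outputs z1, z2, z3 of the first hidden layer.
z_layer1 = [0.0, 1.0, -1.0]

# Activations a1, a2, a3 of the first hidden layer.
a_layer1 = [sigmoid(z) for z in z_layer1]
print(a_layer1)  # the first entry is exactly 0.5, because sigmoid(0) = 0.5
```

Note that the sigmoid is applied to each $z$ separately; it never mixes the outputs of different neurons.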

Now you move to the second hidden layer. Just as the first hidden layer was connected to $x_1$, $x_2$, $x_3$ and $x_4$, each neuron in the second hidden layer has similar connections, with different weights, to the outputs of the first hidden layer. So how do you represent this? Those outputs are now the input going into this neuron.

So how do you represent that? Let's consolidate: write $A^{[1]}$, with a capital letter, for all the outputs from the first layer, which comprise $a_1^{[1]}$, $a_2^{[1]}$ and $a_3^{[1]}$. Now let us write $z_1^{[2]}$, with a small $z$, which is the linear output of the first neuron in the second layer. First we need to write the linear transformation. This neuron gets three inputs, $a_1^{[1]}$ plus $a_2^{[1]}$ plus $a_3^{[1]}$, and accordingly you have one weight with the first, one with the second and one with the third:

$$z_1^{[2]} = w_{11}^{[2]}a_1^{[1]} + w_{12}^{[2]}a_2^{[1]} + w_{13}^{[2]}a_3^{[1]}$$

Similarly, you have $z_2^{[2]}$ for the second neuron of the second layer. The superscript $[1]$ stood for the first layer and the superscript $[2]$ stands for the second layer; whatever I write on the upper side, the superscript, represents the layer. So again you have $a_1^{[1]}$, $a_2^{[1]}$ and $a_3^{[1]}$, now with the weights of the second neuron, $w_{21}^{[2]}$, $w_{22}^{[2]}$ and $w_{23}^{[2]}$:

$$z_2^{[2]} = w_{21}^{[2]}a_1^{[1]} + w_{22}^{[2]}a_2^{[1]} + w_{23}^{[2]}a_3^{[1]}$$

Let us call these two results together $Z^{[2]}$, the linear outputs of the second layer. We need to transform these as well.

So how do we transform this? $\sigma(z_1^{[2]})$ gives us $a_1^{[2]}$. Since $z_1^{[2]}$ is a linear product, I need to pass it through a non-linear function, so I am using my sigmoid function. Similarly, $\sigma(z_2^{[2]})$ is transformed to $a_2^{[2]}$. So whatever reaches the output neuron after this processing is the consolidated result of the individual weights and of the outputs of the first and second layers.

Now, at the output neuron, you want to represent the net input. By net I mean what you get after the computation from all the previous layers, that is, after the first hidden layer and the second hidden layer. Its inputs are $a_1^{[2]}$, the output of the first neuron of the second hidden layer, and $a_2^{[2]}$, the output of the second neuron, so you have $a_1^{[2]}$ plus $a_2^{[2]}$.

Accordingly, I have some weights here as well, $w_{11}$ for the first input and $w_{12}$ for the second:

$$z_{\text{net}} = w_{11}a_1^{[2]} + w_{12}a_2^{[2]}$$

Since we have only one output, this is a binary classification, and we represent it this way. So now you can see how the multiplications and the different representations look at the different layers.
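The whole neuron-by-neuron chain described so far can be sketched as follows. All weight values are hypothetical, the helper `neuron` is an illustrative name, and applying the sigmoid at the output neuron is an assumption made here (the article only says that every neuron has a linear part and a non-linear part).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron(weights, inputs):
    """One unit: linear summation (no bias) followed by the sigmoid."""
    z = sum(w * value for w, value in zip(weights, inputs))
    return sigmoid(z)

x = [1.0, 2.0, 3.0, 4.0]  # hypothetical inputs x1..x4

# Hypothetical weights: one row per neuron, one entry per incoming value.
w_layer1 = [[0.1, -0.2, 0.3, -0.4],   # 3 neurons, 4 inputs each
            [0.2, 0.1, -0.3, 0.1],
            [-0.1, 0.4, 0.2, -0.2]]
w_layer2 = [[0.5, -0.6, 0.7],         # 2 neurons, 3 inputs each
            [-0.8, 0.9, 0.1]]
w_output = [0.3, -0.7]                # 1 neuron, 2 inputs

a1 = [neuron(w, x) for w in w_layer1]    # outputs of hidden layer 1
a2 = [neuron(w, a1) for w in w_layer2]   # outputs of hidden layer 2
y_hat = neuron(w_output, a2)             # assumed: sigmoid at the output too
print(len(a1), len(a2), y_hat)
```

Every layer simply repeats the same pattern on the previous layer's outputs, which is what makes the bookkeeping tedious to do by hand.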

You can also see how tedious this is: you have to backtrack and work out how each multiplication is done. First you need to multiply, and then you need to put the result through a non-linear function to get the output in the specified form, and doing that neuron by neuron becomes very difficult. So instead of doing all those things, we represent everything in matrix form. We put all the weights of the first layer, the second layer, the third layer and so on into matrices.

We put the weights of each layer into its own matrix and multiply it with the input to that layer. For a given example the input stays constant; what changes is the weights. So now we write out our weights, starting with the weights of the first layer. One of the most important things here is to determine the size of the weight matrix at a particular layer. For this layer, I am talking about the weights on the connections going into the first hidden layer.

That is, we are counting the weights between the input layer and the first hidden layer. So how do we determine the size? You have three units and four inputs, so that makes 12 weights, arranged in 3 rows and 4 columns:

$$W^{[1]} = \begin{bmatrix} w_{11} & w_{12} & w_{13} & w_{14} \\ w_{21} & w_{22} & w_{23} & w_{24} \\ w_{31} & w_{32} & w_{33} & w_{34} \end{bmatrix}$$

So for the weights at the first layer you get this matrix, which is $3 \times 4$: 3 rows and 4 columns.

Similarly, we write the weights at the second layer. You can make out from the network that this matrix has two rows and three columns, so it has six entries:

$$W^{[2]} = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix}$$

Finally, you have the weights of the last layer, on the connections between the second hidden layer and the output layer. That matrix is $1 \times 2$, so it has only two entries, $w_{11}$ and $w_{12}$:

$$W^{[3]} = \begin{bmatrix} w_{11} & w_{12} \end{bmatrix}$$
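The sizing rule used above (rows = units in the layer, columns = inputs coming into the layer) can be sketched with NumPy. The weights here are random placeholders, and the variable names are illustrative assumptions rather than anything from the article.

```python
import numpy as np

# Layer sizes of the example network: 4 inputs -> 3 units -> 2 units -> 1 output.
layer_sizes = [4, 3, 2, 1]

rng = np.random.default_rng(0)  # placeholder random weights, for shape only

# Each weight matrix has shape (units in this layer, units in previous layer).
weight_matrices = [
    rng.standard_normal((n_out, n_in))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

for layer, W in enumerate(weight_matrices, start=1):
    print(f"W[{layer}] shape = {W.shape}, entries = {W.size}")
# W[1] shape = (3, 4), entries = 12
# W[2] shape = (2, 3), entries = 6
# W[3] shape = (1, 2), entries = 2
```

Writing the layer sizes once and deriving every shape from them avoids counting connections by hand when the network grows.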

This becomes very handy when you have many weights. Our example is small, but that is not the usual case. When you build a network for production, you can have 50 weights or any higher number, and then it becomes very difficult to sit and multiply each of these by hand, or to write them out one by one when you do the programming.

So, to make our life simpler, we represent the weights as weight matrices. Say we take $W^{[1]}$. Since it is $3 \times 4$, we multiply it with the inputs $x_1$, $x_2$, $x_3$ and $x_4$ written as a column. The weight matrix is $3 \times 4$ and the input is four rows and one column, $4 \times 1$, so the matrix multiplication is valid, and the output is $z_1^{[1]}$, $z_2^{[1]}$ and $z_3^{[1]}$ at the first layer, a $3 \times 1$ vector:

$$\begin{bmatrix} z_1^{[1]} \\ z_2^{[1]} \\ z_3^{[1]} \end{bmatrix} = W^{[1]} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$$

This is the linear transformation. Now we apply the sigmoid over this vector, element by element.

Taking $\sigma(z_1^{[1]})$, $\sigma(z_2^{[1]})$ and $\sigma(z_3^{[1]})$ gives you $a_1^{[1]}$, $a_2^{[1]}$ and $a_3^{[1]}$, which together you call $a^{[1]}$. The superscript refers to the layer, so $a^{[1]}$ is all of the outputs from the first layer. The same is the case for $a^{[2]}$. You perform this operation with the weights in matrix form so that the multiplication becomes very easy to compute. In a Python implementation you do exactly this matrix multiplication, rows times column: the weight matrix multiplied by the input vector.
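The matrix form of the first layer can be sketched in NumPy as below. The weight and input values are hypothetical; the point is the shapes: a (3, 4) matrix times a (4, 1) column gives a (3, 1) column, and the sigmoid is applied element by element.

```python
import numpy as np

def sigmoid(z):
    """Element-wise sigmoid; works on scalars and on NumPy arrays."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical first-layer weight matrix W1, 3 rows (units) by 4 columns (inputs).
W1 = np.array([[0.1, 0.2, 0.3, 0.4],
               [0.5, 0.6, 0.7, 0.8],
               [0.9, 1.0, 1.1, 1.2]])

# Hypothetical inputs x1..x4 as a column vector: 4 rows, 1 column.
x = np.array([[1.0], [2.0], [3.0], [4.0]])

Z1 = W1 @ x        # linear part for all three neurons at once, shape (3, 1)
A1 = sigmoid(Z1)   # activations a1, a2, a3 of the first layer, shape (3, 1)

print(Z1.ravel())          # approximately [ 3.  7. 11.]
print(Z1.shape, A1.shape)  # (3, 1) (3, 1)
```

One matrix product replaces the three separate weighted sums, and the same two lines work unchanged for any number of units or inputs.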

This is done with the help of an operation called dot. When you do the implementation, you use the dot product, and it essentially does the matrix multiplication: $w_{11}x_1 + w_{12}x_2 + w_{13}x_3 + w_{14}x_4$ for the first row, and so on for the remaining rows. As I have already mentioned, whenever you do this kind of computation or build a neural network, you can use TensorFlow as your computation engine. It is one of the most widely used computation engines for building artificial neural networks, for deep learning, and for learning the weights. So, do you remember why it is named TensorFlow?
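A short check that the dot operation really is the sum written above, using NumPy's `np.dot`. The numbers are hypothetical; only the equality between the hand-written sum and the dot product matters.

```python
import numpy as np

# Hypothetical weights of the first neuron (w11..w14) and inputs (x1..x4).
w_row = np.array([0.1, 0.2, 0.3, 0.4])
x = np.array([1.0, 2.0, 3.0, 4.0])

# The sum written out term by term ...
by_hand = w_row[0] * x[0] + w_row[1] * x[1] + w_row[2] * x[2] + w_row[3] * x[3]
# ... and the same thing as a single dot product.
with_dot = np.dot(w_row, x)
print(np.isclose(by_hand, with_dot))  # True

# With a full weight matrix, np.dot computes one such sum per row.
W1 = np.array([[0.1, 0.2, 0.3, 0.4],
               [0.5, 0.6, 0.7, 0.8],
               [0.9, 1.0, 1.1, 1.2]])
z = np.dot(W1, x)  # shape (3,): z1, z2, z3
print(z.shape)     # (3,)
```

For 2-D arrays, `np.dot(W1, x)` and `W1 @ x` give the same result; the `@` operator is simply the more common spelling in recent code.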

What is the reason behind naming it TensorFlow? In mathematics, if you remember, you have different kinds of quantities: scalar quantities, vector quantities, and then tensor quantities. These matrices, column vectors and row vectors are essentially all tensors. At each layer you multiply them, the result flows across all the different layers, and what you finally reach at the output is essentially the result of repeated matrix multiplication.

So that's why it is called TensorFlow: the flow of tensors, here matrices, across the different layers. Well, that is all regarding the representation of weights in matrix form for a neural network.
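To tie the pieces together, here is a minimal NumPy sketch of the values "flowing" through the 4-3-2-1 network of this article. It uses NumPy rather than TensorFlow, the weights are random placeholders, there are no biases (as in the article), and applying the sigmoid at the output layer is an assumption. The function name `forward` is illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weight_matrices):
    """Pass a column vector through each layer: multiply, then apply sigmoid."""
    a = x
    for W in weight_matrices:
        z = W @ a       # linear part of the whole layer
        a = sigmoid(z)  # non-linear part, element-wise
    return a

rng = np.random.default_rng(42)  # placeholder weights, not trained values
weights = [
    rng.standard_normal((3, 4)),  # input layer     -> hidden layer 1
    rng.standard_normal((2, 3)),  # hidden layer 1  -> hidden layer 2
    rng.standard_normal((1, 2)),  # hidden layer 2  -> output layer
]

x = np.array([[1.0], [2.0], [3.0], [4.0]])  # hypothetical inputs, shape (4, 1)
y_hat = forward(x, weights)
print(y_hat.shape)  # (1, 1)
```

The loop body is the same for every layer; only the weight matrix changes, which is the practical payoff of the matrix representation.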