Import Data, Define Model – This is what we are going to learn in this article. Given a 2D image, we first flatten it into a 1D vector and then feed it into a model that classifies the image as one of the single-digit numbers from 0 to 9. Let’s just call this model a network for now. This simple network has only two layers. In this illustration, the input size is 6 times 6, which is 36 in total, and the output size is 10.
That 10 is the number of classes we have. Each connection between the input layer and the output layer is a model parameter, so in this case we have 36 times 10, which is 360 parameters. Now let’s look at how to do image classification with this model, assuming the model parameters are already optimized. The first step is to compute a linear weighted sum of the input with the model weights. The output of this step is z, a 1 by 10 vector that stores one score for each output class.
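The flattening and weighted-sum steps described above can be sketched in plain Python. This is only an illustration under stated assumptions: the image and the weights are random placeholders rather than optimized parameters, the helper names `flatten` and `linear_scores` are hypothetical, and no bias term is included so that the parameter count matches the 360 connections mentioned in the text.

```python
import random

random.seed(0)  # fixed seed so the placeholder numbers are reproducible


def flatten(image):
    """Turn a 2D list of pixel rows into one flat list of pixels."""
    return [pixel for row in image for pixel in row]


def linear_scores(x, weights):
    """Weighted sum z[j] = sum_i x[i] * weights[i][j], one score per class."""
    num_classes = len(weights[0])
    return [
        sum(x[i] * weights[i][j] for i in range(len(x)))
        for j in range(num_classes)
    ]


# Placeholder 6x6 greyscale image with values in [0, 1).
image = [[random.random() for _ in range(6)] for _ in range(6)]

# Placeholder weights: one per input-output connection (36 x 10 = 360).
# A trained model would hold optimized values here instead.
weights = [[random.uniform(-1, 1) for _ in range(10)] for _ in range(36)]

x = flatten(image)             # 36 input values
z = linear_scores(x, weights)  # 10 scores, one per digit class

num_parameters = sum(len(row) for row in weights)
print(len(x), len(z), num_parameters)
```

With random weights the scores in `z` carry no meaning; the point is only the shapes: 36 inputs, 10 scores, and 360 weights connecting them.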
Then we feed these scores to a softmax function. Softmax normalizes the scores so that each value lies between 0 and 1 and all ten values sum to 1, which means they can be interpreted as a probability for each class. In this case, the highest probability belongs to digit 7, so the model predicts that the input image is a 7. This is almost all the math and concepts you need to know. Let’s start the fun part.
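To make the softmax step concrete, here is a small sketch in plain Python. The score values are made up for illustration (chosen so that digit 7 has the largest score), and subtracting the maximum before exponentiating is a common numerical-stability trick that is an addition here, not something the article describes.

```python
import math


def softmax(scores):
    """Map raw scores to probabilities that are positive and sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for digits 0..9; index 7 holds the largest score.
z = [0.5, -1.2, 0.3, 1.1, -0.4, 0.0, 0.8, 3.0, 0.2, -0.7]

probs = softmax(z)

# The prediction is the class with the highest probability.
predicted_digit = max(range(len(probs)), key=lambda k: probs[k])
print(predicted_digit)
```

Because softmax preserves the ordering of the scores, the predicted class is the same one that had the largest raw score; softmax only makes the values readable as probabilities.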
Here, I will walk you through the code for importing the dataset, defining the model, training the model, and evaluating the model in a way you have never seen before. Let’s first have a look at the libraries we are using: PyTorch, NumPy, and scikit-learn (sklearn). They are all well-known libraries commonly used in both research and industry.
The first step is to import the dataset. Torchvision, from the PyTorch project, allows us to download and import the MNIST dataset directly through its API. MNIST is one of the most common datasets used for image classification. It is a dataset of handwritten digits that contains 60,000 training images and 10,000 test images. Then, we use DataLoader to wrap the data in an iterator. During this process, we randomly shuffle the dataset and split it into batches. Shuffling the data helps reduce variance and keeps the model general so that it overfits less.
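The loading step might look roughly like the sketch below. It is not the article's notebook code: the function names `load_mnist_train` and `make_loader`, the `./data` path, and the batch size default are assumptions for illustration. So that the snippet runs without downloading anything, the demonstration at the bottom uses a small random stand-in dataset with MNIST-like shapes; swapping in `load_mnist_train()` would use the real data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def load_mnist_train(root="./data"):
    """Download (if needed) and return the MNIST training set.

    torchvision is imported lazily so the rest of the sketch runs without it.
    ToTensor() converts each image to a float tensor with values in [0, 1].
    """
    from torchvision import datasets, transforms

    return datasets.MNIST(
        root=root, train=True, download=True, transform=transforms.ToTensor()
    )


def make_loader(dataset, batch_size=4):
    """Wrap a dataset in an iterator that shuffles and yields batches."""
    return DataLoader(dataset, batch_size=batch_size, shuffle=True)


# Stand-in data with MNIST-like shapes (12 random "images", NOT real digits).
fake_images = torch.rand(12, 1, 28, 28)
fake_labels = torch.randint(0, 10, (12,))
loader = make_loader(TensorDataset(fake_images, fake_labels))

# Pull a single batch from the iterator and inspect its shape.
images, labels = next(iter(loader))
print(images.shape, labels.shape)
```

Each batch comes back as a pair: a tensor of images with shape (batch, channels, height, width) and a tensor with one label per image.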
For example, you need to shuffle your data if it is sorted by class. Here, each batch contains 4 images. This line of code gets one batch of images. Let’s have a closer look at one of them. The image has 28 times 28 pixels, which is 784 in total. Each pixel holds a number in the range from 0 to 1 that represents its greyscale value. Now it is time to define the model. This line of code defines the model structure. It is a simple two-layer network. The input dimension is 28 times 28, which is 784 in total.
And the output dimension is 10. There is a connection between each pair of input and output, and each connection represents a model parameter. In the forward function, we define how the model runs from input to output. First we need to reshape the 2D image into 1D, so the total number of inputs is 784, which matches the network input. The next step is to calculate the weighted sum of the input x with the model parameters w. Finally, we return the output.
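A model of the kind described above could be written in PyTorch along these lines. This is a hedged sketch, not the article's exact code: the class name `TwoLayerNet` is hypothetical, and it uses `nn.Linear`, which by default also adds one bias value per output on top of the 784 times 10 connection weights.

```python
import torch
from torch import nn


class TwoLayerNet(nn.Module):  # hypothetical name, chosen for this sketch
    def __init__(self, in_features=28 * 28, num_classes=10):
        super().__init__()
        # One weight per input-output connection (784 x 10),
        # plus nn.Linear's default bias of 10 values.
        self.linear = nn.Linear(in_features, num_classes)

    def forward(self, x):
        # Reshape each image to 1D: (batch, 1, 28, 28) -> (batch, 784).
        x = x.reshape(x.shape[0], -1)
        # Weighted sum of the inputs with the model parameters.
        return self.linear(x)


model = TwoLayerNet()

# A random stand-in batch of 4 greyscale images with values in [0, 1).
batch = torch.rand(4, 1, 28, 28)
scores = model(batch)

num_weights = model.linear.weight.numel()
print(scores.shape, num_weights)
```

The output has one row per image and one column per digit class. These are raw scores; a softmax would still be needed to read them as probabilities.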
So that’s an introduction to how we import data and define the model for our image classification task. I will also provide the source code for this article as a Colab notebook in the description. In the next video, I’m going to talk about how to train and evaluate this model.