Training and Testing a Basic Neural Network using PyTorch
In the last article I explained how neural nets work, but how do you take that math and convert it to code? Well, I am here to tell you.
You could go ahead and create a neural network from absolute scratch, but I guess we should save that for some other time.
Thankfully, Google and Facebook have blessed us with two of the most popular neural network libraries, TensorFlow and PyTorch, which make the job quite easy. For this article I’ll be using PyTorch, and I’ll use TensorFlow in the next one.
Well then, let’s create your first NN. By the way, I’ll be using the MNIST dataset, since it’s tradition to use it when you create your first NN. It’s something like the ‘Hello World’ of neural nets, or maybe that’s just me :P
Importing Libraries and Datasets
So, like any program, we have to import our libraries first. We’ll see what each of them does next.
Before defining our model, we should first import the data it is going to train on. We are training on MNIST, and thankfully PyTorch (through torchvision) already provides it :)
There’s a lot going on in this step, so let’s break it down.
First, I assign the variable is_gpu the value returned by torch.cuda.is_available(). It returns a boolean that tells us whether PyTorch can use the GPU.
Next, I define a variable transform that says how to process the dataset being downloaded. Here we are telling it to convert the images to tensors. A tensor is a fancy name for a multi-dimensional array, i.e. a matrix generalized to any number of dimensions.
Next, I download the dataset, passing arguments that tell it to download the data and apply transform to it. The train argument says whether we want the training split or the test split.
Next, I create a DataLoader, a beautiful utility for organising the downloaded dataset. It basically gives us an iterable over our dataset that yields batches of the size given by batch_size=32. shuffle=True tells it to shuffle the data before creating batches (it reshuffles at every epoch). It is usually used to remove any ordering in the dataset, so it’s not strictly required here, but it’s a nice habit to get into.
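If you want to see the batching without downloading anything, this sketch feeds 100 fake images through a DataLoader. The fake data is purely an assumption for illustration. Only batch_size=32 and shuffle=True come from the article.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 100 fake "images" and labels, used only to show how batching behaves.
images = torch.zeros(100, 1, 28, 28)
labels = torch.arange(100) % 10
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=32, shuffle=True)

# Each iteration yields one (images, labels) batch.
batch_sizes = [batch_images.shape[0] for batch_images, batch_labels in loader]

print(len(loader))   # 4
print(batch_sizes)   # [32, 32, 32, 4]: the last batch holds the leftovers
```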
Defining Our Neural Network
Before the main event we have to define the main character, the highlight of the show: our neural network. There are two ways to create neural networks in PyTorch: the class way and the Sequential way. I’ll save the class way for the next article, since it requires a bit more explanation, and we are making a simple neural network that can be built easily using Sequential().
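A minimal sketch of such a Sequential model, using the layer sizes given in this section. One assumption on my part: I leave the final Linear layer without a ReLU, because CrossEntropyLoss expects raw scores. The original listing may have differed.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Flatten(),             # 1x28x28 image -> 784 values
    nn.Linear(28 * 28, 512),  # input layer -> hidden layer 1
    nn.ReLU(),
    nn.Linear(512, 256),      # hidden layer 1 -> hidden layer 2
    nn.ReLU(),
    nn.Linear(256, 10),       # hidden layer 2 -> one score per digit
)

# Move the model to the GPU when one is available.
is_gpu = torch.cuda.is_available()
if is_gpu:
    model = model.cuda()

print(model)
```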
First we import the nn module, which contains the layers, activation functions and other building blocks. Then, in Sequential, we simply define the following model:
Input Layer: 28*28 nodes (corresponding to the 28*28 dimensions of an MNIST image)
Hidden Layer 1: 512 nodes
Hidden Layer 2: 256 nodes
Output Layer: 10 nodes (corresponding to the 10 digit classes of the MNIST dataset)
The model starts with a layer called Flatten(). It takes each image and flattens it into a 1-D tensor.
Next we use Linear(), which applies a linear transformation to the incoming data. We give it the number of incoming features and the number of outgoing features. Think of it as saying: take this tensor of size 28*28 and give me back a tensor of size 512.
Next we use ReLU() to apply the activation function to the output of the previous layer. We repeat this pattern to make Hidden Layer 2 and the Output Layer.
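To see what each of these layers does to the data, here is a small sketch that pushes a random batch through them one at a time. The batch is fake and only there to show the shapes.

```python
import torch
from torch import nn

# A fake batch of 32 MNIST-sized images: (batch, channels, height, width).
batch = torch.rand(32, 1, 28, 28)

# Flatten keeps the batch dimension and squashes everything else.
flat = nn.Flatten()(batch)
print(flat.shape)    # torch.Size([32, 784])

# Linear maps 784 incoming features to 512 outgoing features.
hidden = nn.Linear(28 * 28, 512)(flat)
print(hidden.shape)  # torch.Size([32, 512])

# ReLU replaces every negative value with zero and keeps the shape.
activated = nn.ReLU()(hidden)
print(bool((activated >= 0).all()))  # True
```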
Optimizer and Loss Function
We use CrossEntropyLoss() as the loss function to be minimized and SGD(), i.e. Stochastic Gradient Descent, as our optimizer. Optimizers update the weight parameters to minimize the loss function.
There are many other loss functions and optimizers you can choose from, but these are the ones we use here.
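A minimal sketch of setting these two up. The learning rate lr=0.01 is my assumption, since the article doesn’t state one, and the model is repeated here only so the snippet runs on its own.

```python
import torch
from torch import nn, optim

# Same architecture as described earlier, repeated so this snippet is standalone.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 10),
)

# Loss function: compares raw class scores with integer class labels.
criterion = nn.CrossEntropyLoss()
# Optimizer: plain SGD over all model parameters (lr=0.01 is an assumed value).
optimizer = optim.SGD(model.parameters(), lr=0.01)

# Quick sanity check on a fake batch of 4 images with labels 0..3.
logits = model(torch.rand(4, 1, 28, 28))
targets = torch.tensor([0, 1, 2, 3])
loss = criterion(logits, targets)
print(loss.item())
```

For an untrained network the printed loss is typically close to ln(10) ≈ 2.3, which is what you get from guessing uniformly across 10 classes.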
Training Our Neural Network
In a typical PyTorch training loop you do the following:
1. Clear residual gradients.
2. Make a forward pass and get the output.
3. Calculate the loss and make a backward pass to calculate the gradients.
4. Run the optimizer to update the weights.
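The four steps above can be sketched as a small function. This is my reconstruction from the description in this section, not the original listing: the function name train, the device argument and the default of 10 epochs are all assumptions.

```python
import torch

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def train(model, trainloader, criterion, optimizer, device, epochs=10):
    """Train for `epochs` passes and return the average loss of each epoch."""
    model.to(device)
    model.train()
    history = []
    for epoch in range(epochs):
        train_loss = 0.0
        for data, target in trainloader:
            # Move the batch to the same device as the model.
            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()             # 1. clear residual gradients
            output = model(data)              # 2. forward pass
            loss = criterion(output, target)  # 3. calculate the loss ...
            loss.backward()                   #    ... and backpropagate
            optimizer.step()                  # 4. update the weights

            train_loss += loss.item()

        # Average over the number of batches in this epoch.
        history.append(train_loss / len(trainloader))
        print(f"Epoch {epoch + 1}: training loss {history[-1]:.4f}")
    return history
```

You would call it as train(model, trainloader, criterion, optimizer, device) once the model, data loader, loss function and optimizer exist.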
In the training code, we first define epochs and loop over that many epochs. Then we iterate over trainloader to pass the input data to the NN and train it.
Inside the trainloader loop, we first move our data to the GPU (if one is available) and then clear any leftover gradients so that they don’t affect the current update. Next we make a forward pass and obtain the outputs. We pass these outputs along with the correct labels to the loss function to obtain our loss, and then call backward() to calculate the gradients. Then we call step() to update the weights, and add the loss to train_loss.
Testing Our Neural Network
Training is good and all, but what about testing? The testing loop is pretty much the same as the training loop, except that here we also count the correctly predicted values and the total values, and we don’t perform the optimization step.
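A minimal sketch of such a testing loop, again reconstructed from the description rather than taken from the original listing. The function name evaluate is hypothetical. Calling model.eval() and wrapping the loop in torch.no_grad() are my additions, since no gradients are needed when we only measure performance.

```python
import torch


def evaluate(model, testloader, criterion, device):
    """Return (average loss per batch, accuracy) on the given loader."""
    model.to(device)
    model.eval()
    test_loss, correct, total = 0.0, 0, 0
    # No optimizer step and no gradients: we only measure, we do not learn.
    with torch.no_grad():
        for data, target in testloader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()

            # The predicted digit is the class with the highest score.
            predicted = output.argmax(dim=1)
            correct += (predicted == target).sum().item()
            total += target.size(0)
    return test_loss / len(testloader), correct / total
```

For example, test_loss, accuracy = evaluate(model, testloader, criterion, device) gives you the fraction of test images classified correctly.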
Congratulations!!! We just created and trained our first NN in PyTorch.
There are things you can do to improve it, like hyperparameter optimization using Ray Tune or adding normalization to the transform. So don’t stop here. Keep learning, keep growing :)