How to implement neural networks in PyTorch?
This tutorial shows how to design and train a basic neural network in PyTorch that classifies handwritten digits from the MNIST dataset. Neural networks are central to modern AI and let machines learn tasks such as regression, classification, and generation.
Building a Neural Network Using PyTorch
PyTorch offers two primary methods for building neural networks: using the nn.Module class or the nn.Sequential container.
- Using nn.Module: To create a custom network, subclass the nn.Module class and define the __init__ and forward functions. The __init__ method sets up the layers and parameters, while the forward function defines how input flows through the network and produces output.
- Using nn.Sequential: This container takes the layers as arguments, in order (or as an OrderedDict of named layers), and chains them so that each layer's output feeds the next.
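To make the two approaches concrete, here is a minimal sketch that builds the same small network both ways. The class name TinyNet, the variable tiny_sequential, and the layer sizes (4, 8, 2) are illustrative assumptions, not part of the MNIST model built later.

```python
import torch
import torch.nn as nn

# Approach 1: subclass nn.Module and spell out the forward pass.
class TinyNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return self.fc2(x)

# Approach 2: pass the same layers, in order, to nn.Sequential.
tiny_sequential = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 2),
)

x = torch.randn(3, 4)  # a batch of 3 samples with 4 features each
out_module = TinyNet()(x)
out_sequential = tiny_sequential(x)
print(out_module.shape, out_sequential.shape)
```

Both models map a (3, 4) batch to a (3, 2) output. nn.Sequential is usually the shorter option for a plain stack of layers, while subclassing nn.Module leaves room for custom logic in forward.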
Steps to Implement a Neural Network in PyTorch:
1. Import Required Modules: Bring in necessary libraries like torch, torch.nn, and torch.optim.
2. Define the Network Architecture: Specify the number and types of layers, activation functions, and output size. You can either subclass torch.nn.Module for custom layers or use preset layers like torch.nn.Linear, torch.nn.Conv2d, or torch.nn.LSTM.
3. Set the Loss Function: Choose a loss function based on your task (e.g., torch.nn.MSELoss for regression, torch.nn.CrossEntropyLoss for classification).
4. Choose an Optimizer: Specify an optimizer like torch.optim.SGD, torch.optim.Adam, or torch.optim.RMSprop to adjust the network’s weights using gradients and learning rates.
5. Train the Network: Perform forward and backward passes through the data, and update the weights using the optimizer. Monitor training progress by tracking the loss and additional metrics (e.g., accuracy).
This structure simplifies building and training neural networks in Python with PyTorch.
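As a rough sketch of how the five steps fit together, the example below runs them end to end on a tiny synthetic problem instead of MNIST. The data, the layer sizes, the learning rate of 0.1, and the names toy_model and losses are assumptions chosen only to keep the example fast and self-contained.

```python
# Step 1: import the required modules
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic data (assumption): 200 samples with 2 features each;
# the label is 1 when the two features sum to a positive number.
X = torch.randn(200, 2)
y = (X[:, 0] + X[:, 1] > 0).long()

# Step 2: define the network architecture
toy_model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))

# Step 3: set the loss function
criterion = nn.CrossEntropyLoss()

# Step 4: choose an optimizer
optimizer = optim.SGD(toy_model.parameters(), lr=0.1)

# Step 5: train with repeated forward pass, backward pass, and weight update
losses = []
for step in range(100):
    optimizer.zero_grad()        # clear gradients from the previous step
    logits = toy_model(X)        # forward pass
    loss = criterion(logits, y)  # compare predictions with labels
    loss.backward()              # backward pass
    optimizer.step()             # update the weights
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

The loss at the end should be lower than at the start, which is the basic signal to monitor during training.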
Implementing a Feedforward Neural Network for MNIST using PyTorch
Let's implement a Feedforward Neural Network (FNN) for classifying handwritten digits from the MNIST dataset using PyTorch.
Step 1: Import the Necessary Libraries
We start by importing the libraries we need: torch, torch.nn and torch.nn.functional for building the model, torch.optim for the optimizer, torchvision for dataset handling and image transformations, and matplotlib for plotting sample predictions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
import matplotlib.pyplot as plt
import numpy as np
Step 2: Define Hyperparameters and Transformations
We set hyperparameters like batch_size, num_epochs, and learning_rate for training. A transformation pipeline is applied to MNIST images: converting them to tensors and normalizing the pixel values.
batch_size = 64
num_epochs = 10
learning_rate = 0.01
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))  # Mean and Std of MNIST dataset
])
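To illustrate what the normalization step does numerically, the sketch below applies the same arithmetic, (pixel - mean) / std, to a stand-in tensor using plain torch. The fake_image tensor is an assumption standing in for the output of ToTensor(), which yields values in the range [0, 1].

```python
import torch

mean, std = 0.1307, 0.3081  # the values passed to Normalize above

# Stand-in for a ToTensor() result: shape (1, 28, 28), values in [0, 1].
fake_image = torch.zeros(1, 28, 28)
fake_image[0, 10:18, 10:18] = 1.0  # a white square on a black background

# Normalize subtracts the mean and divides by the std for each channel.
normalized = (fake_image - mean) / std

print(normalized.min().item(), normalized.max().item())
```

Black pixels (0.0) end up slightly below zero and white pixels (1.0) end up well above it, so the inputs are roughly centered around zero before they reach the network.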
Step 3: Load and Prepare the Dataset
The MNIST dataset is loaded for both training and testing. We use DataLoader to manage batching and shuffling of data during training.
train_dataset = datasets.MNIST(root='.', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='.', train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
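To see how DataLoader splits a dataset into batches without downloading anything, here is a small sketch on synthetic data. The random tensors and the names toy_dataset and toy_loader are assumptions; only the batch size of 64 matches the tutorial.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST (assumption): 100 random 1x28x28 "images".
images = torch.randn(100, 1, 28, 28)
targets = torch.randint(0, 10, (100,))
toy_dataset = TensorDataset(images, targets)

toy_loader = DataLoader(toy_dataset, batch_size=64, shuffle=True)

# Collect the shape of every batch the loader yields in one pass.
batch_shapes = [tuple(x.shape) for x, _ in toy_loader]
print(len(toy_loader), batch_shapes)
```

With 100 samples and a batch size of 64, the loader yields two batches, and the last one is smaller (36 samples). The same happens with MNIST whenever the dataset size is not a multiple of the batch size.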
Step 4: Define the Neural Network Model
We define a simple feedforward neural network with two fully connected layers. The first layer takes the flattened image (28x28 pixels) and outputs 512 features. The second layer outputs 10 classes (digits 0-9).
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = x.view(-1, 28*28)    # Flatten the image
        x = F.relu(self.fc1(x))  # ReLU activation
        x = self.fc2(x)          # Output layer
        return x
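A quick sanity check on a model like this is to push a dummy batch through it and count its parameters. The sketch below redefines the same Net so that it runs on its own; the names check_model and dummy_batch are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 512)
        self.fc2 = nn.Linear(512, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)  # flatten each image to 784 values
        x = F.relu(self.fc1(x))
        return self.fc2(x)

check_model = Net()
dummy_batch = torch.randn(64, 1, 28, 28)  # same shape as one MNIST batch
logits = check_model(dummy_batch)

# Count weights and biases across both layers.
num_params = sum(p.numel() for p in check_model.parameters())
print(tuple(logits.shape), num_params)
```

The output has one row per image and one column per digit class. The parameter count is 784*512 + 512 for the first layer plus 512*10 + 10 for the second.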
Step 5: Define the Loss Function, Optimizer, and Model Instance
We use CrossEntropyLoss as the loss function for multi-class classification. The optimizer used is Stochastic Gradient Descent (SGD).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Net().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=learning_rate)
print(model)  # Show the layer structure
Output:
Net(
(fc1): Linear(in_features=784, out_features=512, bias=True)
(fc2): Linear(in_features=512, out_features=10, bias=True)
)
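One detail worth checking is why the model's forward pass ends without a softmax: nn.CrossEntropyLoss expects raw scores (logits) and applies log-softmax internally. The sketch below compares it against the explicit two-step computation on random values; the logits and labels here are made-up inputs for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw scores for 4 samples, 10 classes
labels = torch.tensor([3, 0, 7, 9])   # made-up target classes

# CrossEntropyLoss applied directly to the raw scores.
loss_ce = nn.CrossEntropyLoss()(logits, labels)

# The same quantity computed as log-softmax followed by negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

print(loss_ce.item(), loss_manual.item())
```

The two values agree, which is why adding a softmax layer before this loss would normalize the scores twice.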
Step 6: Define the Training and Test Loops
The training loop processes the batches, computes the gradients, and updates the model parameters. The test loop evaluates the model on the test dataset.
def accuracy(outputs, labels):
    _, preds = torch.max(outputs, 1)
    return torch.sum(preds == labels).item() / len(labels)

def train(model, device, train_loader, criterion, optimizer, epoch):
    model.train()
    running_loss = 0.0
    running_acc = 0.0
    for i, (inputs, labels) in enumerate(train_loader):
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        running_acc += accuracy(outputs, labels)
        if (i + 1) % 200 == 0:
            print(f'Epoch {epoch}, Batch {i+1}, Loss: {running_loss / 200:.4f}, Accuracy: {running_acc / 200:.4f}')
            running_loss = 0.0
            running_acc = 0.0

def test(model, device, test_loader, criterion):
    model.eval()
    test_loss = 0.0
    test_acc = 0.0
    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            test_loss += loss.item()
            test_acc += accuracy(outputs, labels)
    print(f'Test Loss: {test_loss / len(test_loader):.4f}, Test Accuracy: {test_acc / len(test_loader):.4f}')
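The test loop above averages the per-batch accuracies, which gives a smaller final batch the same weight as a full one. As an optional alternative, the sketch below counts correct predictions over all samples instead. The helper name evaluate_exact and the synthetic model and data are assumptions for illustration and are not part of the original tutorial.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate_exact(model, device, loader):  # hypothetical helper
    """Return accuracy as correct predictions divided by total samples."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            preds = model(inputs).argmax(dim=1)  # index of the largest score
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total

# Usage on synthetic data (assumption): an untrained linear classifier.
torch.manual_seed(0)
toy_inputs = torch.randn(100, 1, 28, 28)
toy_labels = torch.randint(0, 10, (100,))
toy_loader = DataLoader(TensorDataset(toy_inputs, toy_labels), batch_size=64)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

toy_acc = evaluate_exact(toy_model, torch.device("cpu"), toy_loader)
print(f"accuracy: {toy_acc:.4f}")
```

On MNIST the difference between the two ways of averaging is small, since only the last of the test batches is shorter than the rest.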
Step 7: Train, Test, and Visualize Results
The model is trained for num_epochs epochs and evaluated on the test set after each epoch. Afterwards, a few sample predictions are visualized using matplotlib.
for epoch in range(1, num_epochs + 1):
    train(model, device, train_loader, criterion, optimizer, epoch)
    test(model, device, test_loader, criterion)

# Visualize sample images with predictions
samples, labels = next(iter(test_loader))
samples = samples.to(device)
with torch.no_grad():  # No gradients are needed for inference
    outputs = model(samples)
_, preds = torch.max(outputs, 1)
samples = samples.cpu().numpy()
fig, axes = plt.subplots(3, 3, figsize=(8, 8))
for i, ax in enumerate(axes.ravel()):
    ax.imshow(samples[i].squeeze(), cmap='gray')
    ax.set_title(f'Label: {labels[i].item()}, Prediction: {preds[i].item()}')
    ax.axis('off')
plt.tight_layout()
plt.show()
Output:
Epoch 1, Batch 200, Loss: 1.1144, Accuracy: 0.7486
Epoch 1, Batch 400, Loss: 0.4952, Accuracy: 0.8739
Epoch 1, Batch 600, Loss: 0.3917, Accuracy: 0.8903
Epoch 1, Batch 800, Loss: 0.3515, Accuracy: 0.9042
Test Loss: 0.3018, Test Accuracy: 0.9155
. . .
Epoch 10, Batch 200, Loss: 0.1112, Accuracy: 0.9679
Epoch 10, Batch 400, Loss: 0.1120, Accuracy: 0.9707
Epoch 10, Batch 600, Loss: 0.1158, Accuracy: 0.9681
Epoch 10, Batch 800, Loss: 0.1138, Accuracy: 0.9688
Test Loss: 0.1145, Test Accuracy: 0.9665
The output reports the training loss and accuracy every 200 batches, followed by the test loss and accuracy for each epoch. The final part of the code displays nine test images along with their true labels and predicted labels.