
Module 3: torch.nn & Training Pipeline

Build deep learning networks with the torch.nn module: define layers and activation functions, forward passes, loss criteria, and training loops.

⏱ 22 Min Read · Author: GenAIWallah Team · Updated: May 2026
Day 5

Creating Models using torch.nn

Why this matters

torch.nn: nn.Module groups layers and their parameters into a model; its forward() method defines the computation applied to each input.

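Subclass nn.Module for custom architectures, or stack layers with nn.Sequential when data flows straight from one layer to the next. Learnable weights are nn.Parameter objects, registered automatically when you assign them (or layers that contain them) as module attributes.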

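import torch
import torch.nn as nn

# Fully connected layer mapping 784 input features to 10 outputs
layer = nn.Linear(784, 10)  # in_features, out_features
x = torch.randn(32, 784)    # batch of 32 samples
out = layer(x)  # shape (32, 10)
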
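  • nn.Linear(in_features, out_features): fully connected layer that applies a learned weight matrix and bias.
  • nn.ReLU(), nn.Sigmoid(): activation functions, used as layers.
  • model.parameters(): iterator over the model's learnable parameters, which you pass to an optimizer.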
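
To make the two ways of building a model concrete, here is a minimal sketch that defines the same small network once as an nn.Module subclass and once with nn.Sequential. The class name TinyClassifier and the hidden size of 64 are illustrative assumptions, not part of the lesson.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical two-layer network for flattened 28x28 inputs."""

    def __init__(self, in_features=784, hidden=64, num_classes=10):
        super().__init__()
        # Assigning layers as attributes registers their weights and biases.
        self.fc1 = nn.Linear(in_features, hidden)
        self.act = nn.ReLU()
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # forward() defines the computation; call model(x), not model.forward(x).
        return self.fc2(self.act(self.fc1(x)))


model = TinyClassifier()

# The same architecture written as a plain stack of layers.
sequential_model = nn.Sequential(
    nn.Linear(784, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

x = torch.randn(32, 784)  # batch of 32 samples
out = model(x)
seq_out = sequential_model(x)

# Registered parameters are what an optimizer will later update.
param_names = [name for name, _ in model.named_parameters()]
num_params = sum(p.numel() for p in model.parameters())
```

Both models map a (32, 784) batch to a (32, 10) output. Subclassing is the more flexible option once the forward pass needs branching or multiple inputs; nn.Sequential is shorter when the layers simply run in order.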

Common mistakes

  • Forgetting optimizer.zero_grad() so gradients accumulate across batches.
  • Tensor shape mismatches (especially batch/channel dimensions for CNNs).
  • Moving the model to the GPU but leaving input tensors on the CPU (or vice versa), which raises a device mismatch error.
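
The shape and device mistakes above can be reproduced in a few lines. This is a sketch under simple assumptions: a single nn.Linear layer, a deliberately wrong input width of 100, and a fallback to the CPU when no GPU is available.

```python
import torch
import torch.nn as nn

layer = nn.Linear(784, 10)

# Shape mismatch: the layer expects 784 features in the last dimension.
bad_input = torch.randn(32, 100)
try:
    layer(bad_input)
    shape_error = None
except RuntimeError as err:
    # The message reports the two matrix shapes that could not be multiplied.
    shape_error = str(err)

# Device handling: pick one device and move both the model and the data to it.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
layer = layer.to(device)
x = torch.randn(32, 784).to(device)
out = layer(x)  # works because layer and x live on the same device
```

Defining a single device variable and calling .to(device) on both the model and every batch is one common way to keep the two in sync.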

Interview checkpoints

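  • Q: Explain torch.nn in PyTorch. A: Give a one-sentence definition, then note the expected tensor shapes and device placement.
  • Q: What is a common bug? A: Accumulated gradients, shape mismatches, or the model and tensors sitting on different devices.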

Practice

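  1. Basic: Define torch.nn and sketch a minimal code snippet.
  2. Intermediate: Run a notebook cell that builds a torch.nn model and passes a batch through it.
  3. Advanced: Intentionally break a torch.nn model (for example, pass an input with the wrong feature size) and interpret the error.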

Recap

  • You can explain torch.nn clearly.
  • You know one mistake to avoid.
  • You see how this connects to the next lesson.

Next: Training Loop

Day 6

The Core Training Loop Lifecycle

Why this matters

Training Loop: The training loop ties together data loading, the forward pass, loss computation, the backward pass, and optimizer.step(). It is the core routine of deep learning.

The standard training loop repeats: fetch batch → zero gradients → forward → loss → backward → step.

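import torch
import torch.nn as nn
import torch.optim as optim

# Placeholder data for illustration: 32 samples, 10 features, 1 target each
X_batch = torch.randn(32, 10)
y_batch = torch.randn(32, 1)

model = nn.Sequential(nn.Linear(10, 1))
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(10):
    optimizer.zero_grad()             # clear gradients from the previous step
    preds = model(X_batch)            # forward pass
    loss = criterion(preds, y_batch)  # compute the loss
    loss.backward()                   # backward pass: populate .grad
    optimizer.step()                  # update the weights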

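Always call optimizer.zero_grad() before backward() in every iteration; otherwise PyTorch adds the new gradients to the ones already stored in .grad. Passing set_to_none=True (the default in recent PyTorch releases) resets gradients to None instead of filling them with zeros.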
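
The loop above reuses a single batch. As a sketch of the full fetch batch → forward → loss → backward → step cycle, the version below slices mini-batches by hand from a synthetic regression dataset. The data, the batch size of 32, the learning rate, and the epoch count are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # reproducible synthetic data

# Synthetic regression data (assumption): y = X @ true_w + small noise.
X = torch.randn(256, 10)
true_w = torch.randn(10, 1)
y = X @ true_w + 0.01 * torch.randn(256, 1)

model = nn.Sequential(nn.Linear(10, 1))
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=1e-2)

batch_size = 32
epoch_losses = []

for epoch in range(20):
    running_loss = 0.0
    for start in range(0, X.size(0), batch_size):
        # Fetch batch by slicing the full tensors.
        X_batch = X[start:start + batch_size]
        y_batch = y[start:start + batch_size]

        optimizer.zero_grad()             # clear old gradients
        preds = model(X_batch)            # forward
        loss = criterion(preds, y_batch)  # loss
        loss.backward()                   # backward
        optimizer.step()                  # update weights

        # Weight each batch loss by its size to get a per-sample average.
        running_loss += loss.item() * X_batch.size(0)
    epoch_losses.append(running_loss / X.size(0))
```

Tracking the average loss per epoch in epoch_losses gives a quick check that training is making progress: the last value should be lower than the first on data like this.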

Common mistakes

  • Forgetting optimizer.zero_grad() so gradients accumulate across batches.
  • Tensor shape mismatches (especially batch/channel dimensions for CNNs).
  • Moving the model to the GPU but leaving input tensors on the CPU (or vice versa), which raises a device mismatch error.
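
The gradient accumulation mistake is easy to observe directly. In this minimal sketch (random data and an SGD optimizer are assumptions for illustration), calling backward() twice on the same batch without zeroing doubles the stored gradient.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

model = nn.Linear(10, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

X_batch = torch.randn(32, 10)
y_batch = torch.randn(32, 1)

# First backward pass: .grad holds the gradient of this loss.
criterion(model(X_batch), y_batch).backward()
grad_once = model.weight.grad.clone()

# Second backward pass on the same batch without zeroing:
# the new gradient is added to the one already stored.
criterion(model(X_batch), y_batch).backward()
grad_twice = model.weight.grad.clone()

# zero_grad() resets the stored gradients before the next backward().
optimizer.zero_grad()
grad_after_reset = model.weight.grad  # None or all zeros, depending on version
```

Here grad_twice equals twice grad_once, which is why an update taken after a missed zero_grad() uses a gradient that is too large.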

Interview checkpoints

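  • Q: Explain the training loop in PyTorch. A: Give a one-sentence definition, then note the expected tensor shapes and device placement.
  • Q: What is a common bug? A: Accumulated gradients, shape mismatches, or the model and tensors sitting on different devices.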

Practice

  1. Basic: Define the training loop and sketch a minimal code snippet.
  2. Intermediate: Run a notebook cell that trains a small model for a few epochs.
  3. Advanced: Intentionally break the training loop (for example, remove optimizer.zero_grad()) and interpret the resulting behavior or error.

Recap

  • You can explain the training loop clearly.
  • You know one mistake to avoid.
  • You see how this connects to the next lesson.

Next: Dataset
