Module 7: CNNs & Image Pipelines
Construct convolutional networks for image classification tasks. Compute Conv2d output dimensions, pooling output sizes, and the input size of the linear layers that follow.
Day 13
Convolutional Neural Networks (CNNs) in PyTorch
Why this matters
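CNN Layers: CNNs exploit spatial structure with shared filters, which makes them the default choice for images.
Conv2d slides learnable filters across the spatial dimensions. Because the same filter weights are reused at every position, a convolutional layer needs far fewer parameters than a fully connected layer applied to the same image.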
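To make the parameter-sharing claim concrete, here is a small arithmetic sketch. It assumes a 3-channel 32x32 input, chosen only for illustration, and compares a 3x3 convolution with 16 filters against a fully connected layer that maps the same input to an output of the same size. The helper names are hypothetical, not PyTorch APIs.

```python
# Parameter counts computed by hand (helper names are illustrative only).

def conv2d_param_count(in_channels, out_channels, kernel_size):
    # One k x k filter per (input channel, output channel) pair,
    # plus one bias per output channel.
    return in_channels * out_channels * kernel_size * kernel_size + out_channels

def linear_param_count(in_features, out_features):
    # One weight per (input, output) pair, plus one bias per output.
    return in_features * out_features + out_features

# Assumed input: a 3-channel 32x32 image.
height, width = 32, 32
conv_params = conv2d_param_count(3, 16, 3)
dense_params = linear_param_count(3 * height * width, 16 * height * width)

print(conv_params)   # 448
print(dense_params)  # 50348032
```

The convolution's count does not depend on the image size at all, while the fully connected count grows with the product of input and output sizes.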
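    import torch.nn as nn

    # The Linear size 16 * 16 * 16 implies 3-channel 32x32 inputs.
    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3, padding=1),  # (N, 3, 32, 32) -> (N, 16, 32, 32)
        nn.ReLU(),
        nn.MaxPool2d(2),                             # -> (N, 16, 16, 16)
        nn.Flatten(),                                # -> (N, 4096)
        nn.Linear(16 * 16 * 16, 10),                 # 10 output classes
    )
kernel_size, stride, and padding control the output spatial size. MaxPool2d downsamples and adds some translation robustness.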
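The spatial size after a convolution or pooling layer follows floor((size + 2 * padding - kernel_size) / stride) + 1 when dilation is 1. The sketch below applies that formula to the model above, assuming a 32x32 input (the size implied by the 16 * 16 * 16 in the Linear layer). The helper name is hypothetical.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    # floor((size + 2 * padding - kernel_size) / stride) + 1, assuming dilation 1.
    return (size + 2 * padding - kernel_size) // stride + 1

input_size = 32  # assumed input height and width

# Conv2d(kernel_size=3, padding=1) with the default stride of 1 keeps the size.
after_conv = conv_output_size(input_size, kernel_size=3, stride=1, padding=1)

# MaxPool2d(2) uses stride 2 by default (stride defaults to kernel_size).
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2)

# Flatten sees 16 channels of after_pool x after_pool values per image.
flattened = 16 * after_pool * after_pool

print(after_conv, after_pool, flattened)  # 32 16 4096
```

If the input size changes, only the in_features of the Linear layer needs to be recomputed this way; the convolution itself accepts any spatial size.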
Common mistakes
- Forgetting optimizer.zero_grad() so gradients accumulate across batches.
- Tensor shape mismatches (especially batch/channel dimensions for CNNs).
- Training on GPU but leaving tensors on CPU (or vice versa).
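A single training step that avoids all three mistakes might look like the following sketch. The random batch, the SGD optimizer, and the learning rate are placeholder assumptions for illustration, not recommendations.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),
).to(device)  # model parameters live on `device`

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Synthetic stand-in batch: 8 RGB images of 32x32, shape (N, C, H, W).
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))

# Inputs must be on the same device as the model.
images, labels = images.to(device), labels.to(device)

optimizer.zero_grad()            # clear gradients left over from the previous step
logits = model(images)           # (8, 10)
loss = loss_fn(logits, labels)
loss.backward()                  # compute fresh gradients
optimizer.step()                 # update the weights

print(logits.shape)  # torch.Size([8, 10])
```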
Interview checkpoints
- Q: Explain CNN layers in PyTorch. A: One-sentence definition + shape/device note.
- Q: Common bug? A: Gradients, shapes, or device mismatch.
Practice
- Basic: Define CNN Layers and sketch a minimal code snippet.
- Intermediate: Run a notebook cell demonstrating CNN Layers.
- Advanced: Intentionally break CNN Layers and interpret the error.
Recap
- You can explain CNN layers clearly.
- You know one mistake to avoid.
- You see how this connects to the next lesson.
Next: Image Data
Day 14
Loading Custom Image Datasets
Why this matters
Image Data: torchvision provides datasets and transforms for reproducible image pipelines.
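torchvision.datasets and torchvision.transforms standardize image pipelines. Typical transforms are a resize, a crop, ToTensor() (converts a PIL image to a float tensor scaled to [0, 1]), and Normalize(mean, std) (standardizes each channel).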
Typical pipeline
Load PIL image → transforms.Compose → tensor (C, H, W) → batch (N, C, H, W) → model.
Common mistakes
- Forgetting optimizer.zero_grad() so gradients accumulate across batches.
- Tensor shape mismatches (especially batch/channel dimensions for CNNs).
- Training on GPU but leaving tensors on CPU (or vice versa).
Interview checkpoints
- Q: Explain image data in PyTorch. A: One-sentence definition + shape/device note.
- Q: Common bug? A: Gradients, shapes, or device mismatch.
Practice
- Basic: Define Image Data and sketch a minimal code snippet.
- Intermediate: Run a notebook cell demonstrating Image Data.
- Advanced: Intentionally break Image Data and interpret the error.
Recap
- You can explain image data clearly.
- You know one mistake to avoid.
- You see how this connects to the next lesson.
Next: RNN Basics
