When building neural networks, one concept that often confuses beginners is the idea of a trainable layer. A trainable layer is any layer that contains parameters, such as weights or biases, that are updated during training by backpropagation.
During training, the following steps happen:

1. The forward pass computes predictions.
2. A loss function measures the prediction error.
3. Backpropagation computes gradients of the loss with respect to every parameter.
4. The optimizer uses those gradients to update the parameters.

Only layers with parameters take part in step 4; these are called trainable layers. To decide whether a layer is trainable, ask one question: does this layer have weights or biases?
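That question can be answered directly in code. The helper below (the name `is_trainable` is just for illustration, not a PyTorch API) checks whether a layer registers any parameters:

```python
import torch.nn as nn

def is_trainable(layer: nn.Module) -> bool:
    """A layer is trainable if it registers at least one parameter."""
    return sum(p.numel() for p in layer.parameters()) > 0

print(is_trainable(nn.Linear(10, 5)))  # True: has weight and bias
print(is_trainable(nn.ReLU()))         # False: no parameters at all
```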
| Layer | Why it is Trainable |
|---|---|
| nn.Linear | Contains weight and bias |
| nn.Conv2d | Contains convolution kernels |
| nn.BatchNorm2d | Contains learnable scale (γ) and shift (β) |
| nn.Embedding | Contains embedding matrix |
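One way to see why each of these layers is trainable is to list its registered parameters and their shapes; a minimal sketch:

```python
import torch.nn as nn

layers = {
    "Linear": nn.Linear(10, 5),
    "Conv2d": nn.Conv2d(3, 8, kernel_size=3),
    "BatchNorm2d": nn.BatchNorm2d(8),   # running stats are buffers, not parameters
    "Embedding": nn.Embedding(100, 16),
}

for name, layer in layers.items():
    shapes = {n: tuple(p.shape) for n, p in layer.named_parameters()}
    print(name, shapes)
```

Every layer in the table reports a non-empty set of parameter shapes, which is exactly what makes it trainable.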
Example:

```python
self.fc = nn.Linear(10, 5)
```

This layer internally contains two parameters:

- `weight` with shape `(5, 10)`
- `bias` with shape `(5,)`

These parameters are updated during training.
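To confirm that the parameters really do change, here is a sketch of a single optimizer step; the loss is an arbitrary dummy chosen only to produce gradients:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
fc = nn.Linear(10, 5)
opt = torch.optim.SGD(fc.parameters(), lr=0.1)

before = fc.weight.detach().clone()

loss = fc(torch.randn(4, 10)).pow(2).mean()  # dummy loss, just to get gradients
loss.backward()
opt.step()

# The weight tensor is no longer equal to its pre-step copy.
print(torch.equal(before, fc.weight.detach()))
```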
Some layers perform operations but do not learn parameters.
| Layer | Reason |
|---|---|
| ReLU | Applies max(0,x) |
| Sigmoid | Applies 1 / (1 + e^(-x)) |
| Dropout | Random masking |
| MaxPool | Selects maximum value |
| Flatten | Changes tensor shape |
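The same parameter check from earlier confirms that these layers own nothing to train:

```python
import torch.nn as nn

stateless = [nn.ReLU(), nn.Sigmoid(), nn.Dropout(0.5), nn.MaxPool2d(2), nn.Flatten()]

for layer in stateless:
    n_params = sum(p.numel() for p in layer.parameters())
    print(type(layer).__name__, n_params)  # every count is 0
```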
Example:

```python
x = torch.relu(x)
```

ReLU simply applies the function `max(0, x)` element-wise. It contains no parameters and therefore is not trainable.
All trainable layers must be created inside `__init__` and assigned as attributes on `self`; this is how PyTorch registers their parameters with the module.
```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 4)
        self.fc2 = nn.Linear(4, 1)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x
```
PyTorch discovers these parameters automatically through:

```python
model.parameters()
```

which recursively collects every parameter registered on the module and its submodules; this is the iterable you hand to an optimizer.
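For example, `model.named_parameters()` lists every registered parameter of a model like the one above (redefined here so the snippet is self-contained):

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 4)
        self.fc2 = nn.Linear(4, 1)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = Model()
for name, p in model.named_parameters():
    print(name, tuple(p.shape))
# fc1 and fc2 each contribute a weight and a bias
```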
A common mistake is defining trainable layers inside the forward function:

```python
def forward(self, x):
    fc = nn.Linear(2, 4)
    return fc(x)
```

This is incorrect because:

- a brand-new layer with fresh random weights is created on every forward pass;
- the layer is a local variable, so its parameters are never registered with the module;
- `model.parameters()` therefore never sees them, and the optimizer can never update them.
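The failure is easy to demonstrate: a module that builds its layer inside `forward` exposes no parameters at all (`BadModel` is a hypothetical name for illustration):

```python
import torch.nn as nn

class BadModel(nn.Module):
    def forward(self, x):
        fc = nn.Linear(2, 4)  # new random layer on every call, never registered
        return fc(x)

model = BadModel()
print(len(list(model.parameters())))  # 0 -> the optimizer would see nothing to train
```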
Think of a neural network as a factory: each layer is a machine on the assembly line. Trainable layers are machines with adjustable knobs (their parameters), while non-trainable layers are fixed machines that always do the same thing. Training adjusts the knobs to reduce prediction error.
Trainable layers are the parts of a neural network that contain parameters and get updated during training.
If a component has no parameters, it is not trainable — even if it is called a layer.