Activation Layers
Activation layers are `Module` wrappers around the functional activation ops, which makes them convenient to compose inside `Sequential` or custom `Module` subclasses. Each layer holds no trainable parameters; its `forward` method simply calls the corresponding function from `simplegrad.functions.activations`.
```python
import simplegrad as sg
import simplegrad.nn as nn

model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)
out = model(sg.ones((4, 16)))
```
All activation layers inherit from Module and share its methods (.parameters(), .to_device(), .set_train_mode(), etc.).
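
The wrapper pattern described above can be sketched without the library. This is a minimal illustration, not simplegrad's actual source: the `Module` class here is a hypothetical stand-in with only the two methods the example needs, and `relu` operates on plain Python lists rather than tensors.

```python
def relu(values):
    """Functional op: max(0, x) for each element of a flat list."""
    return [max(0.0, v) for v in values]


class Module:
    """Hypothetical stand-in for a Module base class (not simplegrad's)."""

    def parameters(self):
        # Stateless by default: no trainable parameters to report.
        return []

    def __call__(self, x):
        # Calling the layer dispatches to forward().
        return self.forward(x)


class ReLU(Module):
    """Stateless layer that only delegates to the functional op."""

    def forward(self, x):
        return relu(x)


layer = ReLU()
out = layer([-2.0, 0.0, 3.5])
```

Because the layer stores no state of its own, `layer.parameters()` is empty and the result is identical to calling the function directly.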
ReLU
| Method | Description |
|---|---|
| `.forward()` | Apply `max(0, x)` element-wise. |
ELU
Bases: Module
ELU (Exponential Linear Unit) activation layer.
Applies `elu(x, alpha)` element-wise. See `simplegrad.functions.activations.elu` for the full definition.
Parameters:
- `alpha` (float, default: 1.0): Scale for the negative saturation region.
Attributes
| Attribute | Type | Description |
|---|---|---|
| `.alpha` | `float` | Scale for the negative saturation region. Defaults to 1.0. |
Methods
| Method | Description |
|---|---|
| `.forward()` | Apply ELU activation element-wise. |
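
To make the role of `alpha` concrete, here is a scalar sketch of ELU. It assumes the standard definition (`x` for positive inputs, `alpha * (exp(x) - 1)` otherwise); the authoritative definition is the one in `simplegrad.functions.activations.elu`, which this page does not reproduce.

```python
import math


def elu(x, alpha=1.0):
    """Standard ELU (assumed): x for x > 0, alpha * (exp(x) - 1) otherwise."""
    # expm1(x) computes exp(x) - 1 accurately for small |x|.
    return x if x > 0 else alpha * math.expm1(x)


samples = [-10.0, -1.0, 0.0, 2.0]
default_out = [elu(v) for v in samples]
scaled_out = [elu(v, alpha=0.5) for v in samples]
```

Under this definition, strongly negative inputs approach `-alpha`, so `alpha` sets how far below zero the output can go, while positive inputs pass through unchanged.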
Tanh
| Method | Description |
|---|---|
| `.forward()` | Apply `tanh(x)` element-wise. |
Sigmoid
| Method | Description |
|---|---|
| `.forward()` | Apply `1 / (1 + exp(-x))` element-wise. |
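
As a side note on the formula, evaluating `1 / (1 + exp(-x))` literally can overflow for large negative `x` in plain floating-point code. The scalar sketch below shows one common way to avoid that by branching on the sign. Whether simplegrad's implementation does something similar is not stated here, so treat it purely as an illustration.

```python
import math


def sigmoid(x):
    """Sigmoid evaluated without overflowing for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(-x) can overflow; exp(x) / (1 + exp(x)) is equivalent.
    z = math.exp(x)
    return z / (1.0 + z)


values = [sigmoid(v) for v in (-1000.0, 0.0, 1000.0)]
```

Both branches compute the same function; the split only keeps the argument to `exp` non-positive.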
GELU
Bases: Module
GELU (Gaussian Error Linear Unit) activation layer.
Applies `gelu(x, mode)` element-wise. See `simplegrad.functions.activations.gelu` for the full definition and the difference between modes.
Parameters:
- `mode` (str, default: 'erf'): Computation mode, either "erf" (exact, default) or "tanh" (approximation).
Attributes
| Attribute | Type | Description |
|---|---|---|
| `.mode` | `str` | Approximation mode: "erf" (exact) or "tanh" (fast). Defaults to "erf". |
Methods
| Method | Description |
|---|---|
| `.forward()` | Apply GELU activation element-wise. |
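
The difference between the two modes can be illustrated with a scalar sketch. It assumes the widely used formulas: the exact form `0.5 * x * (1 + erf(x / sqrt(2)))` and the tanh approximation with the constant `0.044715`. The library's own definitions live in `simplegrad.functions.activations.gelu` and may differ in detail.

```python
import math


def gelu(x, mode="erf"):
    """GELU via the exact erf form or the common tanh approximation (assumed)."""
    if mode == "erf":
        return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
    if mode == "tanh":
        inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)
        return 0.5 * x * (1.0 + math.tanh(inner))
    raise ValueError(f"unknown mode: {mode!r}")


xs = [-3.0, -1.0, 0.0, 1.0, 3.0]
# Largest absolute disagreement between the two modes on these points.
max_gap = max(abs(gelu(x, "erf") - gelu(x, "tanh")) for x in xs)
```

On these sample points the two modes stay within about one thousandth of each other, which is why the tanh form is commonly treated as a drop-in approximation.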
Softmax
Bases: Module
Softmax activation layer.
Parameters:
- `dim` (int | None, default: None): Dimension to normalize over. `None` normalizes over all elements.
Attributes
| Attribute | Type | Description |
|---|---|---|
| `.dim` | `int \| None` | Axis along which softmax is computed. |
Methods
| Method | Description |
|---|---|
| `.forward()` | Apply softmax along the configured axis. |
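
The effect of `dim` can be sketched on nested Python lists. The helper below is an illustration rather than the library's code: it normalizes a flat list, and the two calls mimic normalizing per row versus over all elements (the documented `None` behavior). Subtracting the maximum before exponentiating is a common stability trick and is an assumption here, not something this page states about simplegrad.

```python
import math


def softmax(values):
    """Softmax over a flat list, shifted by the max for numerical stability."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


logits = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]

# Per-row normalization, analogous to choosing the last dimension.
per_row = [softmax(row) for row in logits]

# Normalization over every element, analogous to dim=None as documented.
flat = softmax([v for row in logits for v in row])
```

With per-row normalization each row sums to 1 independently; with normalization over all elements, the six values share a single total of 1.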