Activation Layers

Activation layers are Module wrappers around the functional activation ops, which makes them convenient to compose inside Sequential or custom Module subclasses. No layer holds trainable parameters; each one simply calls the corresponding function from simplegrad.functions.activations in its forward method.

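import simplegrad as sg
import simplegrad.nn as nn

# Two linear layers with a ReLU activation layer between them.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 10),
)
# Forward pass on a batch of 4 inputs with 16 features each.
out = model(sg.ones((4, 16)))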

All activation layers inherit from Module and share its methods (.parameters(), .to_device(), .set_train_mode(), etc.).
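
The wrapper pattern can be sketched in plain Python. The `Module` base class and `relu` function below are minimal hypothetical stand-ins that operate on lists of floats, not simplegrad's actual implementation; they only illustrate how a parameter-free layer delegates to a functional op.

```python
# Hypothetical stand-ins: simplegrad's real Module and relu operate on tensors.
def relu(xs):
    """Functional op: max(0, x) applied to each element of a list."""
    return [max(0.0, x) for x in xs]


class Module:
    """Minimal base class: calling the module runs forward()."""

    def parameters(self):
        return []  # nothing trainable by default

    def __call__(self, *args):
        return self.forward(*args)


class ReLU(Module):
    """Stateless layer that simply delegates to the functional op."""

    def forward(self, xs):
        return relu(xs)


layer = ReLU()
result = layer([-2.0, 0.0, 3.5])
print(result)
print(layer.parameters())
```

Because the layer stores no state of its own, it can be dropped between any two layers of a sequential model without affecting the parameter list.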


ReLU

Bases: Module

ReLU activation layer: max(0, x).

Method Description
.forward() Apply max(0, x) element-wise.

ELU

Bases: Module

ELU (Exponential Linear Unit) activation layer.

Applies elu(x, alpha) element-wise. See simplegrad.functions.activations.elu for the full definition.

Parameters:

  • alpha (float, default: 1.0) –

    Scale for the negative saturation region.

Attributes

Attribute Type Description
.alpha float Scale for the negative saturation region. Defaults to 1.0.

Methods

Method Description
.forward() Apply ELU activation element-wise.
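
For orientation, the standard ELU definition is x for x > 0 and alpha * (exp(x) - 1) otherwise, so the output approaches -alpha for large negative inputs. The sketch below uses only the standard library and assumes simplegrad's elu follows this standard definition; `elu_scalar` is a hypothetical helper name, not part of the library.

```python
import math


def elu_scalar(x, alpha=1.0):
    """Standard ELU on a single float (assumed to match simplegrad's elu)."""
    # math.expm1(x) computes exp(x) - 1 accurately for small |x|.
    return x if x > 0 else alpha * math.expm1(x)


# Positive inputs pass through unchanged; negative inputs approach -alpha.
for value in (2.0, 0.0, -1.0, -20.0):
    print(value, elu_scalar(value, alpha=0.5))
```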

Tanh

Bases: Module

Tanh activation layer: tanh(x).

Method Description
.forward() Apply tanh(x) element-wise.

Sigmoid

Bases: Module

Sigmoid activation layer: 1 / (1 + exp(-x)).

Method Description
.forward() Apply 1 / (1 + exp(-x)) element-wise.
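
A literal translation of 1 / (1 + exp(-x)) can overflow in exp for large negative x. The sketch below shows one common way to evaluate the same formula safely on a single float; it is an illustration under that assumption and says nothing about how simplegrad computes it internally. `sigmoid_scalar` is a hypothetical helper name.

```python
import math


def sigmoid_scalar(x):
    """Sigmoid on one float, branching so exp() never sees a large positive argument."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z lies in [0, 1)
    return z / (1.0 + z)


for value in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
    print(value, sigmoid_scalar(value))
```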

GELU

Bases: Module

GELU (Gaussian Error Linear Unit) activation layer.

Applies gelu(x, mode) element-wise. See simplegrad.functions.activations.gelu for the full definition and the difference between modes.

Parameters:

  • mode (str, default: 'erf') –

    Computation mode: "erf" (exact) or "tanh" (approximation).

Attributes

Attribute Type Description
.mode str Computation mode: "erf" (exact) or "tanh" (fast approximation). Defaults to "erf".

Methods

Method Description
.forward() Apply GELU activation element-wise.
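
To make the two modes concrete, the sketch below evaluates the standard exact formula, 0.5 * x * (1 + erf(x / sqrt(2))), next to the widely used tanh approximation. It assumes simplegrad's "erf" and "tanh" modes correspond to these standard formulas; `gelu_scalar` is a hypothetical helper, not the library function.

```python
import math


def gelu_scalar(x, mode="erf"):
    """GELU on a single float using the standard exact or tanh formula."""
    if mode == "erf":
        return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))
    if mode == "tanh":
        inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
        return 0.5 * x * (1.0 + math.tanh(inner))
    raise ValueError(f"unknown mode: {mode!r}")


# Print both variants side by side to see how closely they agree.
for value in (-2.0, -0.5, 0.0, 1.0, 3.0):
    exact = gelu_scalar(value, "erf")
    approx = gelu_scalar(value, "tanh")
    print(f"x={value:+.1f}  erf={exact:+.6f}  tanh={approx:+.6f}")
```

The two variants differ only slightly over typical input ranges, so the choice is mostly a trade-off between exactness and the cost of evaluating erf.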

Softmax

Bases: Module

Softmax activation layer.

Parameters:

  • dim (int | None, default: None) –

    Dimension to normalize over. Defaults to None (all elements).

Attributes

Attribute Type Description
.dim int | None Axis along which softmax is computed.

Methods

Method Description
.forward() Apply softmax along the configured axis.
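
The effect of dim can be illustrated on a small nested list. The sketch below is a plain-Python stand-in that mirrors the documented semantics (dim=None normalizes over all elements); `softmax_1d` and `softmax_2d` are hypothetical helpers, and subtracting the maximum before exponentiating is a common stability trick assumed here rather than taken from simplegrad.

```python
import math


def softmax_1d(xs):
    """Softmax of a flat list, shifted by the max for numerical stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_2d(rows, dim=None):
    """Softmax over a list of equal-length rows.

    dim=None normalizes over all elements, dim=0 over each column,
    dim=1 over each row.
    """
    n_cols = len(rows[0])
    if dim is None:
        flat = softmax_1d([x for row in rows for x in row])
        return [flat[i * n_cols:(i + 1) * n_cols] for i in range(len(rows))]
    if dim == 1:
        return [softmax_1d(row) for row in rows]
    if dim == 0:
        cols = [softmax_1d(list(col)) for col in zip(*rows)]
        return [list(row) for row in zip(*cols)]
    raise ValueError(f"dim must be None, 0, or 1, got {dim!r}")


x = [[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]]
by_row = softmax_2d(x, dim=1)   # each row sums to 1
by_col = softmax_2d(x, dim=0)   # each column sums to 1
overall = softmax_2d(x)         # all six entries together sum to 1
print([sum(row) for row in by_row])
print([by_col[0][j] + by_col[1][j] for j in range(3)])
print(sum(sum(row) for row in overall))
```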