Loss Functions

Loss functions measure the discrepancy between model predictions and targets. ce_loss (cross-entropy) is the standard choice for classification; it expects raw logits and integer class indices. mse_loss (mean squared error) is the usual choice for regression. With the default reduction, both return a scalar Tensor, and calling .backward() on it computes gradients for every tensor in the graph that requires them. With reduction=None they return per-sample losses instead.

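```python
import simplegrad as sg

# Batch of 4 samples with 10 class scores each
logits = sg.normal((4, 10), requires_grad=True)
# One integer class index per sample
targets = sg.Tensor([2, 7, 0, 5])
loss = sg.ce_loss(logits, targets)
# Backpropagate from the loss
loss.backward()
```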

ce_loss

Cross-entropy loss over raw logits. A softmax is applied internally, so do not pass pre-softmaxed probabilities.
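To illustrate why pre-softmaxed inputs are a problem, here is a minimal pure-Python sketch. It is not simplegrad code: `softmax` and `cross_entropy` are hypothetical helpers written only for this illustration, and targets are assumed to be integer class indices. It evaluates one sample twice, once from raw logits and once from already-normalized probabilities, so that softmax is effectively applied twice in the second case.

```python
import math

def softmax(scores):
    """Softmax of a list of floats, shifted by the max for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, target):
    """Cross-entropy for one sample; softmax is applied to `scores` internally."""
    return -math.log(softmax(scores)[target])

logits = [4.0, 1.0, 0.5]
target = 0  # the model is confident and correct

correct = cross_entropy(logits, target)         # intended usage: raw logits
wrong = cross_entropy(softmax(logits), target)  # softmax applied twice

print(f"from logits:        {correct:.4f}")
print(f"from probabilities: {wrong:.4f}")
```

No error is raised in the second case; the loss is simply larger than it should be, because a second softmax flattens the distribution. That silent failure is the reason to pass logits only.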

\[ \mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N} \log\frac{e^{x_{i,y_i}}}{\sum_j e^{x_{i,j}}} \]

`ce_loss(z: Tensor, y: Tensor, dim: int = -1, reduction: str = 'mean') -> Tensor`

Compute cross-entropy loss with built-in softmax.

Numerically stable: uses the log-sum-exp trick internally.
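The log-sum-exp trick can be sketched in a few lines of pure Python. This is an illustration of the general technique, not simplegrad's actual implementation; `logsumexp` and `ce_from_logits` are hypothetical names, and targets are assumed to be integer class indices. The idea is to factor the largest logit out of the sum so that no exponential can overflow.

```python
import math

def logsumexp(scores):
    """log(sum(exp(s))) computed without overflow by factoring out the max."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def ce_from_logits(logits, target):
    """Per-sample cross-entropy: logsumexp(logits) - logits[target]."""
    return logsumexp(logits) - logits[target]

# A naive implementation would call math.exp(1000.0), which raises OverflowError.
big = [1000.0, 999.0, 998.0]
loss = ce_from_logits(big, target=0)

# Shifting every logit by the same constant leaves the loss unchanged,
# so these smaller logits give the same value.
small = [2.0, 1.0, 0.0]
same = ce_from_logits(small, target=0)

print(loss, same)
```

Because the largest shifted exponent is always zero, every term passed to `math.exp` is at most 1, which is what keeps the computation finite for arbitrarily large logits.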

Parameters:

z (Tensor) – Logits (raw unnormalized scores), shape (..., num_classes).

y (Tensor) – Target probability distribution, same shape as z.

dim (int, default: -1) – Class dimension to apply softmax over. Defaults to -1 (last dim).

reduction (str, default: 'mean') – How to reduce the per-sample losses. One of "mean", "sum", or None (return per-sample losses).

Raises:

ValueError – If reduction is not a valid option.
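The three reduction modes and the error case can be sketched as follows. `reduce_losses` is a hypothetical helper operating on a plain list of floats, written here only to show the semantics described above; simplegrad's internal code and its exact error message may differ.

```python
def reduce_losses(per_sample, reduction="mean"):
    """Combine per-sample losses the way the `reduction` argument describes."""
    if reduction is None:
        return list(per_sample)  # no reduction: one loss per sample
    if reduction == "sum":
        return sum(per_sample)
    if reduction == "mean":
        return sum(per_sample) / len(per_sample)
    # Assumed message; the library's wording is not specified here.
    raise ValueError(f"invalid reduction: {reduction!r}")

per_sample = [0.5, 1.5, 4.0]
print(reduce_losses(per_sample, "mean"))  # 2.0
print(reduce_losses(per_sample, "sum"))   # 6.0
print(reduce_losses(per_sample, None))    # [0.5, 1.5, 4.0]
```

Note that only "mean" and "sum" produce a single value; with None the result has one entry per sample.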

mse_loss

\[ \mathcal{L} = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2 \]

`mse_loss(p: Tensor, y: Tensor, reduction: str = 'mean') -> Tensor`

Compute mean squared error loss: mean((p - y)^2).

Parameters:

p (Tensor) – Predictions tensor.

y (Tensor) – Targets tensor, same shape as p.

reduction (str, default: 'mean') – One of "mean", "sum", or None.

Raises:

ValueError – If reduction is not a valid option.
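As a worked example of the mean-reduced formula and the gradient that backpropagation would produce for it, here is a small pure-Python sketch. It does not use simplegrad; `mse` and `mse_grad` are hypothetical helpers over plain lists. The analytic gradient with respect to each prediction, 2 * (p_i - y_i) / N, is checked against central finite differences.

```python
def mse(p, y):
    """Mean squared error between two equal-length lists of floats."""
    return sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(p)

def mse_grad(p, y):
    """Gradient of the mean-reduced loss w.r.t. each prediction: 2 * (p_i - y_i) / N."""
    n = len(p)
    return [2.0 * (pi - yi) / n for pi, yi in zip(p, y)]

predictions = [2.5, 0.0, 2.0]
targets = [3.0, -0.5, 2.0]

loss = mse(predictions, targets)  # (0.25 + 0.25 + 0.0) / 3
grad = mse_grad(predictions, targets)

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
numeric = []
for i in range(len(predictions)):
    up = list(predictions)
    up[i] += eps
    down = list(predictions)
    down[i] -= eps
    numeric.append((mse(up, targets) - mse(down, targets)) / (2 * eps))

print(loss)
print(grad)
print(numeric)
```

The gradient is zero for the third element, where the prediction already matches the target, and its sign elsewhere points in the direction that increases the error.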