Linear

Linear applies the affine transformation y = x @ W + b to its input and is the fundamental building block of fully connected networks. By default, the weight matrix W is initialised with Kaiming uniform initialisation and the bias b is set to zero. Both are registered as learnable parameters, so any Optimizer can update them.

import simplegrad as sg
import simplegrad.nn as nn

fc = nn.Linear(in_features=784, out_features=256)
x = sg.normal((32, 784))
out = fc(x)  # shape: (32, 256)
out.sum().backward()
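
To make the shape convention concrete, here is a dependency-free sketch of the computation y = x @ W + b on plain nested lists. It is an illustration only, not simplegrad's implementation; `linear_forward` is a hypothetical helper name, and the small example values are made up for the demonstration.

```python
def linear_forward(x, W, b=None):
    """Compute x @ W + b for plain nested lists (illustrative helper).

    x: batch of shape (batch, in_features)
    W: weights of shape (in_features, out_features)
    b: optional bias of shape (out_features,)
    """
    in_features, out_features = len(W), len(W[0])
    out = []
    for row in x:
        if len(row) != in_features:
            raise ValueError("input width must equal in_features")
        # One output value per column of W: dot product of the row with that column.
        y = [sum(row[i] * W[i][j] for i in range(in_features))
             for j in range(out_features)]
        if b is not None:
            # The same bias vector is added to every row of the batch.
            y = [yj + bj for yj, bj in zip(y, b)]
        out.append(y)
    return out


x = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]    # (batch=2, in_features=3)
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # (in_features=3, out_features=2)
b = [0.5, -0.5]                           # (out_features=2,)
out = linear_forward(x, W, b)             # shape: (2, 2)
```

Note that W is stored as (in_features, out_features), so the input multiplies W directly with no transpose.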

Linear

Bases: Module

Fully-connected linear layer: output = x @ W + b.

Weights are initialized with Kaiming uniform, sampled from the range [-1/sqrt(in_features), 1/sqrt(in_features)].
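
As a rough illustration of the stated range, the sketch below samples a weight matrix uniformly from [-1/sqrt(in_features), 1/sqrt(in_features)] using only the standard library. It is not simplegrad's code; `uniform_init` and its `seed` argument are hypothetical names introduced for this example.

```python
import math
import random


def uniform_init(in_features, out_features, seed=None):
    """Sample an (in_features, out_features) matrix from U(-bound, bound),
    where bound = 1 / sqrt(in_features). Illustrative helper only."""
    rng = random.Random(seed)
    bound = 1.0 / math.sqrt(in_features)
    weights = [[rng.uniform(-bound, bound) for _ in range(out_features)]
               for _ in range(in_features)]
    return weights, bound


# For in_features=784 the bound is 1 / sqrt(784) = 1 / 28.
weights, bound = uniform_init(in_features=784, out_features=256, seed=0)
```

The bound shrinks as in_features grows, which keeps the scale of x @ W roughly independent of the input width.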

Parameters:

  • in_features (int | None, default: None ) –

    Number of input features.

  • out_features (int | None, default: None ) –

    Number of output features.

  • weight (Tensor | None, default: None ) –

    Optional pre-built weight tensor of shape (in_features, out_features).

  • bias (Tensor | None, default: None ) –

    Optional pre-built bias tensor of shape (out_features,).

  • use_bias (bool, default: True ) –

    Whether to add a bias term.

  • dtype (str, default: None ) –

    Data type string. If None, "float32" is used.

  • weight_label (str, default: 'W' ) –

    Label for the weight tensor (used in graph visualization).

  • bias_label (str, default: 'b' ) –

    Label for the bias tensor.

Attributes

| Attribute | Type | Description |
| --- | --- | --- |
| .weight | Tensor | Weight matrix of shape (in_features, out_features). Learnable. |
| .bias | Tensor \| None | Bias vector of shape (out_features,). None if use_bias=False. |
| .in_features | int | Number of input features. |
| .out_features | int | Number of output features. |
| .use_bias | bool | Whether a bias term is included. |
| .dtype | str | Data type of the weight and bias tensors. |

Methods

| Method | Description |
| --- | --- |
| .forward() | Compute x @ W + b. |

Inherits all methods from Module: .parameters(), .submodules(), .to_device(), .summary(), .set_train_mode(), .set_eval_mode().