Linear

Linear applies the affine transformation y = x @ W + b to its input and is the basic building block of fully connected networks. By default, the weight matrix is drawn with Kaiming uniform initialization and the bias is set to zero. Both are registered as learnable parameters, so any Optimizer can update them.

import simplegrad as sg
import simplegrad.nn as nn

fc = nn.Linear(in_features=784, out_features=256)
x = sg.normal((32, 784))
out = fc(x)  # shape: (32, 256)
out.sum().backward()
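
For intuition about the gradients that a call like out.sum().backward() asks for, here is a plain-Python sketch of the mathematics for an affine layer. It does not use simplegrad. The helper name sum_loss_grads and the tiny shapes are hypothetical and chosen only for illustration.

```python
def sum_loss_grads(x, out_features):
    """Gradients of sum(x @ W + b) with respect to W and b.

    x is a list of rows with shape (batch, in_features). Every output
    element enters the sum with coefficient 1, so the gradients do not
    depend on the current values of W or b.
    """
    batch = len(x)
    in_features = len(x[0])
    # dL/dW[i][j] = sum over the batch of x[n][i], the same for every j
    col_sums = [sum(row[i] for row in x) for i in range(in_features)]
    grad_w = [[col_sums[i]] * out_features for i in range(in_features)]
    # dL/db[j] = number of rows in the batch
    grad_b = [float(batch)] * out_features
    return grad_w, grad_b


x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # shape (2, 3)
grad_w, grad_b = sum_loss_grads(x, out_features=2)
print(grad_w)  # [[5.0, 5.0], [7.0, 7.0], [9.0, 9.0]]
print(grad_b)  # [2.0, 2.0]
```

The weight gradient has the same shape as W, (in_features, out_features), and the bias gradient has the same shape as b, (out_features,).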

`Linear`

Bases: Module

Fully-connected linear layer: output = x @ W + b.

Weights are initialized with Kaiming uniform (range [-1/sqrt(in_features), 1/sqrt(in_features)]).
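
The stated range can be illustrated with a short standard-library sketch. This is not simplegrad's actual initializer. The function name uniform_init and the seed argument are hypothetical, and the fan-in is assumed to be in_features.

```python
import math
import random


def uniform_init(in_features, out_features, seed=0):
    """Sample an (in_features, out_features) matrix from U(-bound, bound)."""
    # Bound taken from the range given in the docstring: 1 / sqrt(fan-in)
    bound = 1.0 / math.sqrt(in_features)
    rng = random.Random(seed)  # local generator, so results are repeatable
    weights = [[rng.uniform(-bound, bound) for _ in range(out_features)]
               for _ in range(in_features)]
    return weights, bound


W, bound = uniform_init(in_features=784, out_features=256)
print(bound)  # 1 / sqrt(784) = 1/28, about 0.0357
```

A wider layer input gives a narrower sampling range, which keeps the scale of the pre-activation sums roughly independent of in_features.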

Parameters:

in_features (int | None, default: None) – Number of input features.
out_features (int | None, default: None) – Number of output features.
weight (Tensor | None, default: None) – Optional pre-built weight tensor of shape (in_features, out_features).
bias (Tensor | None, default: None) – Optional pre-built bias tensor of shape (out_features,).
use_bias (bool, default: True) – Whether to add a bias term.
dtype (str, default: None) – Data type string. Falls back to "float32" when left as None.
weight_label (str, default: 'W') – Label for the weight tensor (used in graph visualization).
bias_label (str, default: 'b') – Label for the bias tensor.

Attributes

Attribute	Type	Description
`.weight`	`Tensor`	Weight matrix of shape `(in_features, out_features)`. Learnable.
`.bias`	`Tensor | None`	Bias vector of shape `(out_features,)`. `None` if `use_bias=False`.
`.in_features`	`int`	Number of input features.
`.out_features`	`int`	Number of output features.
`.use_bias`	`bool`	Whether a bias term is included.
`.dtype`	`str`	Data type of the weight and bias tensors.

Methods

Method	Description
`.forward()`	Compute `x @ W + b`.
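
As a shape check for `x @ W + b`, the following plain-Python sketch computes the same affine map on nested lists. It is an illustration only, not the library's implementation, and the name linear_forward is hypothetical. Passing b=None mirrors the use_bias=False case.

```python
def linear_forward(x, W, b=None):
    """Compute x @ W + b for x of shape (batch, in) and W of shape (in, out)."""
    in_features, out_features = len(W), len(W[0])
    out = []
    for row in x:
        if len(row) != in_features:
            raise ValueError("input width must equal in_features")
        # One dot product per output feature
        y = [sum(row[i] * W[i][j] for i in range(in_features))
             for j in range(out_features)]
        if b is not None:
            # The bias is added to every row of the batch
            y = [y_j + b_j for y_j, b_j in zip(y, b)]
        out.append(y)
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # shape (3, 2)
W = [[1.0, 0.0, -1.0], [0.5, 2.0, 1.0]]    # shape (2, 3)
b = [1.0, 2.0, 3.0]                        # shape (3,)
out = linear_forward(x, W, b)
print(out)  # [[3.0, 6.0, 4.0], [6.0, 10.0, 4.0], [9.0, 14.0, 4.0]]
```

The output has shape (batch, out_features), here (3, 3), matching the (32, 784) to (32, 256) mapping in the example at the top of the page.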

Inherits all methods from Module: .parameters(), .submodules(), .to_device(), .summary(), .set_train_mode(), .set_eval_mode().