Linear
Linear applies an affine transformation y = x @ W + b to its input. It is the fundamental building block of fully connected networks. The weight matrix is initialised with Kaiming uniform initialisation and the bias is set to zero by default. Both are registered as learnable parameters and updated by any Optimizer.
```python
import simplegrad as sg
import simplegrad.nn as nn

fc = nn.Linear(in_features=784, out_features=256)
x = sg.normal((32, 784))
out = fc(x)  # shape: (32, 256)
out.sum().backward()
```
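To make the shapes and the gradients concrete, the following NumPy sketch reproduces the same computation outside simplegrad. It is not simplegrad's implementation: the variable names and the seeded generator are illustrative assumptions. It computes `x @ W + b` and the gradients that backpropagating from `out.sum()` would be expected to yield.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only for reproducibility

batch, in_features, out_features = 32, 784, 256
x = rng.standard_normal((batch, in_features))
W = rng.standard_normal((in_features, out_features))
b = np.zeros(out_features)

# Forward pass: the bias broadcasts over the batch dimension.
out = x @ W + b  # shape: (32, 256)

# For loss = out.sum(), the gradient with respect to out is all ones.
grad_out = np.ones_like(out)
grad_W = x.T @ grad_out        # shape: (784, 256)
grad_b = grad_out.sum(axis=0)  # shape: (256,); each entry equals the batch size
grad_x = grad_out @ W.T        # shape: (32, 784)
```

Each gradient has the same shape as the array it belongs to, which is what an optimizer needs to update the weight and bias in place.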
Linear
Bases: Module
Fully-connected linear layer: output = x @ W + b.
Weights are initialized with Kaiming uniform (range [-1/sqrt(in), 1/sqrt(in)]).
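The stated range can be sketched in NumPy as follows. This illustrates sampling from U(-1/sqrt(in), 1/sqrt(in)) and is not simplegrad's actual initialiser; the function name `kaiming_uniform` and its signature are hypothetical.

```python
import math
import numpy as np

def kaiming_uniform(in_features, out_features, rng=None):
    """Sample an (in_features, out_features) matrix from U(-bound, bound),
    where bound = 1 / sqrt(in_features)."""
    rng = np.random.default_rng() if rng is None else rng
    bound = 1.0 / math.sqrt(in_features)
    return rng.uniform(-bound, bound, size=(in_features, out_features))

W = kaiming_uniform(784, 256, rng=np.random.default_rng(0))
bound = 1.0 / math.sqrt(784)  # 1/28, roughly 0.0357
```

A uniform distribution on [-a, a] has variance a²/3, so with a = 1/sqrt(in) each weight has variance 1/(3 · in). The spread of the weights therefore shrinks as the number of input features grows.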
Parameters:
- in_features (int | None, default: None) – Number of input features.
- out_features (int | None, default: None) – Number of output features.
- weight (Tensor | None, default: None) – Optional pre-built weight tensor of shape (in_features, out_features).
- bias (Tensor | None, default: None) – Optional pre-built bias tensor of shape (out_features,).
- use_bias (bool, default: True) – Add a bias term. Defaults to True.
- dtype (str, default: None) – Data type string. Defaults to "float32".
- weight_label (str, default: 'W') – Label for the weight tensor (used in graph visualization).
- bias_label (str, default: 'b') – Label for the bias tensor.
Attributes
| Attribute | Type | Description |
|---|---|---|
| .weight | Tensor | Weight matrix of shape (in_features, out_features). Learnable. |
| .bias | Tensor \| None | Bias vector of shape (out_features,). None if use_bias=False. |
| .in_features | int | Number of input features. |
| .out_features | int | Number of output features. |
| .use_bias | bool | Whether a bias term is included. |
| .dtype | str | Data type of the weight and bias tensors. |
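
How these attributes relate to the constructor arguments can be sketched with a small NumPy stand-in. `TinyLinear` and its `num_parameters` helper are hypothetical, written only for illustration, and are not simplegrad's `Linear`. The sketch assumes that a pre-built `weight` determines the feature counts, and it uses plain arrays rather than `Tensor` objects.

```python
import math
import numpy as np

class TinyLinear:
    """Hypothetical NumPy stand-in mirroring the documented attributes of Linear."""

    def __init__(self, in_features=None, out_features=None, weight=None,
                 bias=None, use_bias=True, dtype="float32", rng=None):
        rng = np.random.default_rng() if rng is None else rng
        if weight is None:
            if in_features is None or out_features is None:
                raise ValueError("give in_features and out_features, or a weight")
            # No pre-built weight: sample from U(-1/sqrt(in), 1/sqrt(in)).
            bound = 1.0 / math.sqrt(in_features)
            weight = rng.uniform(-bound, bound, size=(in_features, out_features))
        self.weight = np.asarray(weight, dtype=dtype)
        # Assumption: the feature counts are read off the weight's shape.
        self.in_features, self.out_features = self.weight.shape
        self.use_bias = use_bias
        self.dtype = dtype
        if not use_bias:
            self.bias = None  # matches the documented "None if use_bias=False"
        elif bias is None:
            self.bias = np.zeros(self.out_features, dtype=dtype)
        else:
            self.bias = np.asarray(bias, dtype=dtype)

    def forward(self, x):
        out = x @ self.weight
        return out if self.bias is None else out + self.bias

    def num_parameters(self):
        n = self.weight.size
        return n if self.bias is None else n + self.bias.size

fc = TinyLinear(in_features=784, out_features=256)
no_bias = TinyLinear(in_features=784, out_features=256, use_bias=False)
```

For 784 inputs and 256 outputs, the weight holds 784 × 256 = 200,704 values and the bias adds 256 more, for 200,960 learnable values in total. Without a bias, the count stays at 200,704.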
Methods
| Method | Description |
|---|---|
| .forward() | Compute x @ W + b. |
Inherits all methods from Module: .parameters(), .submodules(), .to_device(), .summary(), .set_train_mode(), .set_eval_mode().