SGD
simplegrad.optimizers.sgd.SGD
Bases: Optimizer
Stochastic gradient descent with optional momentum.
Update rule (with momentum):

```
v_t = momentum * v_{t-1} - lr * (1 - dampening) * grad
param += v_t
```
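The rule above can be sketched on plain Python floats; `sgd_step`, `params`, `grads`, and `velocities` here are hypothetical stand-ins for the model's parameter tensors, not simplegrad's actual internals:

```python
def sgd_step(params, grads, velocities, lr=0.01, momentum=0.0, dampening=0.0):
    """Apply the documented update rule in place to flat lists of floats."""
    for i, (p, g) in enumerate(zip(params, grads)):
        # v_t = momentum * v_{t-1} - lr * (1 - dampening) * grad
        v = momentum * velocities[i] - lr * (1 - dampening) * g
        velocities[i] = v
        # param += v_t
        params[i] = p + v
```

With `momentum=0` and `dampening=0` this reduces to vanilla SGD, `param -= lr * grad`.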
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | The model whose parameters to optimize. | *required* |
| `lr` | `float` | Learning rate. | `0.01` |
| `momentum` | `float` | Momentum factor. 0 disables momentum. | `0` |
| `dampening` | `float` | Dampening applied to the gradient. | `0` |
Raises:

| Type | Description |
|---|---|
| `TypeError` | If … |
Source code in simplegrad/optimizers/sgd.py
step()
Apply one SGD update step to all model parameters.
Raises:

| Type | Description |
|---|---|
| `ValueError` | If any parameter gradient is `None` (forgot to call `backward()`). |
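The precondition behind that `ValueError` can be illustrated with a minimal guard; `check_grads`, `named_grads`, and the message text are assumptions for illustration, not simplegrad's actual implementation:

```python
def check_grads(named_grads):
    """Sketch of the documented check: every parameter needs a gradient
    (i.e. backward() must have been called) before step() can run."""
    for name, grad in named_grads:
        if grad is None:
            raise ValueError(
                f"Gradient for parameter {name!r} is None; "
                "did you forget to call backward()?"
            )
```

A training loop would run this check (conceptually) at the top of `step()`, after the forward and backward passes.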