
.step()

Apply one Adam update step to all parameters across all groups.

Uses the lr, beta_1, beta_2, eps, and maximize stored in each parameter group, so different groups may use different hyperparameters.

Raises:

  • ValueError

    If any parameter's gradient is None, typically because backward was not called before step().
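
The per-group behavior and the None-gradient check described above can be illustrated with a minimal, self-contained sketch. This is not the library's implementation: `Param`, `SketchAdam`, the dict-based group layout, and the choice to validate all gradients before touching any parameter are assumptions made for the example. It uses the standard bias-corrected Adam update, and the real class may differ in details such as where eps is added.

```python
import math


class Param:
    """Hypothetical scalar parameter holding a value and its gradient."""

    def __init__(self, value):
        self.value = value
        self.grad = None  # a real framework would set this during backward


class SketchAdam:
    """Illustrative Adam with per-group hyperparameters (assumed layout)."""

    def __init__(self, groups):
        self.groups = groups
        self.state = {}  # moment estimates and step count, keyed by id(param)

    def step(self):
        # Assumption: validate first, so a missing gradient leaves every
        # parameter untouched instead of applying a partial update.
        for group in self.groups:
            for p in group["params"]:
                if p.grad is None:
                    raise ValueError("gradient is None; call backward before step")

        for group in self.groups:
            # Each group supplies its own hyperparameters.
            lr, b1, b2 = group["lr"], group["beta_1"], group["beta_2"]
            eps, maximize = group["eps"], group["maximize"]
            for p in group["params"]:
                g = -p.grad if maximize else p.grad  # flip sign to ascend
                s = self.state.setdefault(id(p), {"m": 0.0, "v": 0.0, "t": 0})
                s["t"] += 1
                s["m"] = b1 * s["m"] + (1 - b1) * g
                s["v"] = b2 * s["v"] + (1 - b2) * g * g
                m_hat = s["m"] / (1 - b1 ** s["t"])  # bias-corrected first moment
                v_hat = s["v"] / (1 - b2 ** s["t"])  # bias-corrected second moment
                p.value -= lr * m_hat / (math.sqrt(v_hat) + eps)


# Two groups with different lr and maximize settings.
a, b = Param(1.0), Param(1.0)
opt = SketchAdam([
    {"params": [a], "lr": 0.1, "beta_1": 0.9, "beta_2": 0.999,
     "eps": 1e-8, "maximize": False},
    {"params": [b], "lr": 0.01, "beta_1": 0.9, "beta_2": 0.999,
     "eps": 1e-8, "maximize": True},
])
a.grad, b.grad = 2.0, 2.0
opt.step()
# a.value decreases by roughly 0.1; b.value increases by roughly 0.01
```

On the first step the bias-corrected update has magnitude close to lr regardless of the gradient's scale, which is why the two parameters move by about their respective learning rates, in opposite directions because only the second group sets maximize.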