SGD

class dragon.optimizers.SGD(
  base_lr=0.01,
  momentum=0.9,
  **kwargs
)[source]

The optimizer to apply the MomentumSGD algorithm [Polyak, 1964].

The MomentumSGD update is defined as:

\[\text{MomentumSGD}(g) = -(\text{momentum} * m_{t-1} + \text{lr} * g) \]
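For reference, the recursion behind this rule can be written out in a few lines of plain Python. This is only an illustrative sketch of the math, not part of the dragon API; the names and the NumPy arrays are stand-ins for the optimizer's internal state:

import numpy as np

def momentum_sgd_step(param, grad, m, lr=0.01, momentum=0.9):
    # Accumulate the momentum buffer: m_t = momentum * m_{t-1} + lr * g.
    m = momentum * m + lr * grad
    # Apply the negative update: param_t = param_{t-1} - m_t.
    return param - m, m

# Illustrative usage with NumPy arrays standing in for tensors.
w = np.zeros(3)
m = np.zeros_like(w)
g = np.array([0.1, -0.2, 0.3])
w, m = momentum_sgd_step(w, g, m)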

__init__

SGD.__init__(
  base_lr=0.01,
  momentum=0.9,
  **kwargs
)[source]

Create an SGD updater.

Parameters:
  • base_lr (float, optional, default=0.01) – The initial value for \(\text{lr}\).
  • momentum (float, optional, default=0.9) – The initial value for \(\text{momentum}\).
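For example, assuming the dragon package is installed and importable, an updater with a smaller learning rate and stronger momentum can be created like this:

import dragon

# Create an SGD updater, overriding the default hyper-parameters.
optimizer = dragon.optimizers.SGD(base_lr=0.005, momentum=0.95)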

Methods

apply_gradients

Optimizer.apply_gradients(
  values_and_grads,
  lr_mult=None,
  decay_mult=None
)[source]

Apply the gradients to the values.

Parameters:
  • values_and_grads (Sequence[Sequence[dragon.Tensor]]) – The values and their gradients, given as pairs of tensors.
  • lr_mult (number, optional) – The multiplier to the learning rate.
  • decay_mult (number, optional) – The multiplier to the weight decay.
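A minimal sketch of a call, assuming weight and grad are existing dragon.Tensor objects obtained elsewhere (e.g., from a backward pass); the multiplier values are illustrative:

# `weight` and `grad` are assumed to be dragon.Tensor objects produced elsewhere.
optimizer = dragon.optimizers.SGD(base_lr=0.01, momentum=0.9)
optimizer.apply_gradients(
  values_and_grads=[[weight, grad]],  # one (value, grad) pair
  lr_mult=1.0,    # optional scale on the learning rate
  decay_mult=1.0, # optional scale on the weight decay
)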