SGD

class dragon.vm.tensorflow.keras.optimizers.SGD(
  learning_rate=0.01,
  momentum=0.0,
  nesterov=False,
  name=None,
  **kwargs
)

The optimizer that applies the SGD algorithm.

The following SGD algorithms are supported:

VanillaSGD, whose update is defined as:

\[\text{VanillaSGD}(g) = -\text{lr} * g \]

MomentumSGD [Polyak, 1964], whose update is defined as:

\[\text{MomentumSGD}(g) = -(\text{momentum} * m_{t-1} + \text{lr} * g) \]

NesterovSGD [Sutskever et al., 2013], whose update is defined as:

\[\text{NesterovSGD}(g) = -((1 + \text{momentum}) * m_{t} - \text{momentum} * m_{t-1}) \\ \quad \\ \text{where} \quad m_{t} = \text{momentum} * m_{t-1} + \text{lr} * g \]
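The three update rules above can be sketched in plain Python for scalar values. This is only an illustration of the formulas, not Dragon's implementation: the function name ``sgd_update`` is hypothetical, and the sketch assumes that the momentum buffer starts at zero and that MomentumSGD carries the same \(m_{t}\) recursion given for NesterovSGD.

```python
def sgd_update(g, m_prev, lr=0.01, momentum=0.0, nesterov=False):
    """Return ``(update, m_t)`` for one step (hypothetical helper)."""
    # m_t = momentum * m_{t-1} + lr * g
    m_t = momentum * m_prev + lr * g
    if nesterov:
        # NesterovSGD(g) = -((1 + momentum) * m_t - momentum * m_{t-1})
        update = -((1 + momentum) * m_t - momentum * m_prev)
    else:
        # MomentumSGD(g) = -m_t, which reduces to VanillaSGD
        # (-lr * g) when momentum is 0.
        update = -m_t
    return update, m_t


# Minimize f(w) = w ** 2 for a few steps with MomentumSGD.
w, m = 1.0, 0.0  # assumed initial state: buffer starts at zero
for _ in range(3):
    g = 2.0 * w  # gradient of w ** 2
    update, m = sgd_update(g, m, lr=0.1, momentum=0.9)
    w += update  # the update is added to the parameter
```

With ``momentum=0.0`` both branches give ``-lr * g``, so the VanillaSGD case needs no separate code path in this sketch.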

Select one of them through the constructor arguments:

# Assumes Dragon's TensorFlow-style frontend is imported as ``tf``
from dragon.vm import tensorflow as tf

# Set the ``lr`` only
vanilla_sgd = tf.keras.optimizers.SGD(learning_rate=0.1)

# Set the ``lr`` and ``momentum``
momentum_sgd = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9)

# Set the ``lr``, ``momentum`` and ``nesterov``
nesterov_sgd = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9, nesterov=True)

__init__

SGD.__init__(
  learning_rate=0.01,
  momentum=0.0,
  nesterov=False,
  name=None,
  **kwargs
)

Create an SGD optimizer.

Parameters:
  • learning_rate (float, optional, default=0.01) – The initial value for \(\text{lr}\).
  • momentum (float, optional, default=0.0) – The initial value for \(\text{momentum}\).
  • nesterov (bool, optional, default=False) – Whether to use the NesterovSGD update instead of MomentumSGD.
  • name (str, optional) – The optional optimizer name.

Properties

iterations

Optimizer.iterations

Return the number of steps that have been run.

Returns:
int – The iterations.

Methods

apply_gradients

Optimizer.apply_gradients(grads_and_vars)

Apply the gradients to update variables.

Parameters:
  • grads_and_vars (Sequence[Sequence[dragon.Tensor]]) – The gradients and variables.
Returns:
dragon.vm.tensorflow.keras.optimizers.Optimizer – The optimizer itself, used to generate the update operations.
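The calling pattern described above can be illustrated with a toy stand-in written in plain Python. ``ToyOptimizer`` is hypothetical and is not Dragon's ``Optimizer``. The sketch assumes that each element of ``grads_and_vars`` is a ``(gradient, variable)`` pair, following the parameter name and the TensorFlow convention, and that ``iterations`` advances by one per call. It uses one-element lists in place of ``dragon.Tensor`` variables and a VanillaSGD update.

```python
class ToyOptimizer:
    """Hypothetical stand-in that mimics the documented interface."""

    def __init__(self, learning_rate=0.01):
        self.learning_rate = learning_rate
        self._iterations = 0

    @property
    def iterations(self):
        # Number of steps that have been run so far.
        return self._iterations

    def apply_gradients(self, grads_and_vars):
        for grad, var in grads_and_vars:
            # VanillaSGD update, applied in place to a one-element list.
            var[0] -= self.learning_rate * grad
        # Assumption: one call counts as one step.
        self._iterations += 1
        return self


grads = [0.5, -1.0]
variables = [[1.0], [2.0]]  # stand-ins for tensor variables
opt = ToyOptimizer(learning_rate=0.1)
# Pair each gradient with its variable, gradient first.
result = opt.apply_gradients(zip(grads, variables))
```

Because the method returns the optimizer itself, ``result`` and ``opt`` refer to the same object in this sketch.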