Adam
- class dragon.optimizers.Adam(
 lr=0.001,
 beta1=0.9,
 beta2=0.999,
 eps=1e-08,
 **kwargs
 )
- The optimizer that applies the Adam algorithm [Kingma & Ba, 2014].
- The Adam update is defined as:
\[
\text{Adam}(g) = \text{lr} * \left(\frac{\text{correction} * m_{t}}{\sqrt{v_{t}} + \epsilon}\right) \\
\quad \\
\text{where}\quad
\begin{cases}
\text{correction} = \sqrt{1 - \beta_{2}^{t}} / (1 - \beta_{1}^{t}) \\
m_{t} = \beta_{1} * m_{t-1} + (1 - \beta_{1}) * g \\
v_{t} = \beta_{2} * v_{t-1} + (1 - \beta_{2}) * g^{2}
\end{cases}
\]
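The update rule above can be sketched in plain Python for a single scalar parameter. This is an illustrative sketch, not Dragon's implementation: the helper name `adam_step` is hypothetical, and it assumes that \(m_{0} = v_{0} = 0\), that \(t\) counts steps starting from 1, and that the returned update is subtracted from the parameter.

```python
import math


def adam_step(g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return (update, m_t, v_t) for one scalar gradient g at step t >= 1.

    Hypothetical helper that mirrors the documented formula term by term.
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    # Bias correction folded into a single scalar factor.
    correction = math.sqrt(1 - beta2 ** t) / (1 - beta1 ** t)
    update = lr * (correction * m / (math.sqrt(v) + eps))
    return update, m, v


# Usage: minimize f(x) = x^2, whose gradient is 2x.
# Assumption: the parameter is updated as x <- x - Adam(g).
x, m, v = 1.0, 0.0, 0.0
for t in range(1, 1001):
    g = 2.0 * x
    update, m, v = adam_step(g, m, v, t, lr=0.01)
    x -= update
```

Folding both bias corrections into one scalar factor, as the formula does, avoids computing separate corrected copies of \(m_{t}\) and \(v_{t}\) at each step.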
__init__
- Adam.__init__(
 lr=0.001,
 beta1=0.9,
 beta2=0.999,
 eps=1e-08,
 **kwargs
 )
- Create an Adam updater.
- Parameters:
- lr (float, optional, default=0.001) – The initial value of \(\text{lr}\).
- beta1 (float, optional, default=0.9) – The initial value of \(\beta_{1}\).
- beta2 (float, optional, default=0.999) – The initial value of \(\beta_{2}\).
- eps (float, optional, default=1e-8) – The initial value of \(\epsilon\).
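To see how these parameters interact, it may help to work through the first step of the update defined above. The derivation below assumes \(m_{0} = v_{0} = 0\) and \(t = 1\); the page does not state these initial values, so treat them as an assumption.

\[
\begin{aligned}
m_{1} &= (1 - \beta_{1}) * g, \qquad v_{1} = (1 - \beta_{2}) * g^{2}, \qquad \text{correction} = \sqrt{1 - \beta_{2}} / (1 - \beta_{1}) \\
\text{Adam}(g) &= \text{lr} * \frac{\sqrt{1 - \beta_{2}} * g}{\sqrt{1 - \beta_{2}} * |g| + \epsilon} \;\approx\; \text{lr} * \operatorname{sign}(g) \quad \text{when } \sqrt{1 - \beta_{2}} * |g| \gg \epsilon
\end{aligned}
\]

Under that assumption the first update has magnitude close to \(\text{lr}\) regardless of the gradient scale. With the defaults, \(\sqrt{1 - \beta_{2}} \approx 0.0316\), so \(\epsilon\) only becomes noticeable once \(|g|\) approaches roughly \(3 \times 10^{-7}\).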
 
 
Methods
apply_gradients
- Optimizer.apply_gradients(grads_and_vars)
- Apply the gradients to the variables.
- Parameters:
- grads_and_vars (Sequence[Sequence[dragon.Tensor]]) – The sequence of (gradient, variable) pairs to update.
 
 
