axis=- 1,

Apply the layer normalization. [Ba, 2016]

The normalization is defined as:

\[\text{out} = \frac{x - \mathrm{E}[x]}{\sqrt{\mathrm{Var}[x] + \epsilon}} * \gamma + \beta \]

Note that the number of inputs should be 3, i.e., this operators is implemented into the fused version.

However, you can still fix the gamma and beta, by disabling the their gradients directly.

  • inputs (Sequence[dragon.Tensor]) – The tensor x, gamma and beta.
  • axis (int, optional, default=-1) – The channel axis.
  • epsilon (float, optional, default=1e-5) – The value to \(\epsilon\).

dragon.Tensor – The output tensor.