TransformerEncoder

class dragon.vm.torch.nn.TransformerEncoder(
  d_model,
  nhead,
  num_layers,
  dim_feedforward=2048,
  dropout=0.1,
  activation='relu',
  norm=None,
  norm_first=False
)[source]

Standard transformer encoder. [Vaswani et al., 2017].

Examples:

from dragon.vm import torch

src = torch.ones(4, 2, 8)  # (seq_length, batch_size, d_model)
encoder = torch.nn.TransformerEncoder(d_model=8, nhead=2, num_layers=1)
out = encoder(src)  # Same shape as src: (4, 2, 8)

__init__

TransformerEncoder.__init__(
  d_model,
  nhead,
  num_layers,
  dim_feedforward=2048,
  dropout=0.1,
  activation='relu',
  norm=None,
  norm_first=False
)[source]

Create a TransformerEncoder.

Parameters:
  • d_model (int) The dimension of features.
  • nhead (int) The number of parallel attention heads.
  • num_layers (int) The number of stacked layers.
  • dim_feedforward (int, optional, default=2048) The dimension of the feedforward network.
  • dropout (float, optional, default=0.1) The dropout ratio.
  • activation (str, optional, default='relu') The activation function.
  • norm (torch.nn.Module, optional) The normalization module applied to the final output.
  • norm_first (bool, optional, default=False) Apply layer norm before the attention and feedforward blocks.
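
The norm and norm_first arguments together configure a pre-norm encoder. A minimal sketch of that configuration, assuming torch.nn.LayerNorm follows the PyTorch API and that norm is applied to the final output:

from dragon.vm import torch

encoder = torch.nn.TransformerEncoder(
    d_model=8,
    nhead=2,
    num_layers=2,
    dim_feedforward=32,
    norm=torch.nn.LayerNorm(8),  # Assumed: normalizes the final output.
    norm_first=True,  # Layer norm before attention and feedforward.
)
src = torch.ones(4, 2, 8)  # (seq_length, batch_size, d_model)
out = encoder(src)  # Same shape as src: (4, 2, 8)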