TransformerDecoder

class dragon.vm.torch.nn.TransformerDecoder(
  d_model,
  nhead,
  num_layers,
  dim_feedforward=2048,
  dropout=0.1,
  activation='relu',
  norm=None
)[source]

The standard transformer decoder. [Vaswani et al., 2017].

Examples:

memory = torch.ones(4, 2, 8)  # (source_length, batch_size, d_model)
tgt = torch.ones(5, 2, 8)     # (target_length, batch_size, d_model)
decoder = torch.nn.TransformerDecoder(d_model=8, nhead=2, num_layers=1)
out = decoder(tgt, memory)    # (target_length, batch_size, d_model)

__init__

TransformerDecoder.__init__(
  d_model,
  nhead,
  num_layers,
  dim_feedforward=2048,
  dropout=0.1,
  activation='relu',
  norm=None
)[source]

Create a TransformerDecoder.

Parameters:
  • d_model (int) – The dimension of features.
  • nhead (int) – The number of parallel heads.
  • num_layers (int) – The number of stack layers.
  • dim_feedforward (int, optional, default=2048) – The dimension of feedforward network.
  • dropout (float, optional, default=0.1) – The dropout ratio.
  • activation (str, optional, default='relu') – The activation function.
  • norm (Module, optional) – The optional normalization module applied after the stacked layers.
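In practice the decoder is usually run with a causal target mask so that each target position attends only to itself and earlier positions. Assuming the forward call mirrors torch's optional tgt_mask argument (an assumption; the argument is not shown in the signature above), such an additive mask can be sketched in plain Python:

```python
def causal_mask(size):
    # Additive attention mask: 0.0 on and below the diagonal,
    # -inf above it, so position i cannot attend to positions > i.
    return [[0.0 if j <= i else float('-inf') for j in range(size)]
            for i in range(size)]

mask = causal_mask(3)
# mask[0] -> [0.0, -inf, -inf]
```

If dragon's forward signature matches torch's, the mask (converted to a tensor of shape (target_length, target_length)) would be passed as decoder(tgt, memory, tgt_mask=mask); check the dragon source to confirm.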