TransformerDecoderLayer

class dragon.vm.torch.nn.TransformerDecoderLayer(
  d_model,
  nhead,
  dim_feedforward=2048,
  dropout=0.1,
  activation='relu'
)

Layer for a standard transformer decoder. [Vaswani et al., 2017].

Examples:

from dragon.vm import torch

memory = torch.ones(4, 2, 8)  # (sequence, batch, d_model)
tgt = torch.ones(5, 2, 8)     # (sequence, batch, d_model)
decoder_layer = torch.nn.TransformerDecoderLayer(d_model=8, nhead=2)
out = decoder_layer(tgt, memory)  # Same shape as tgt: (5, 2, 8)
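
Conceptually, a standard decoder layer [Vaswani et al., 2017] chains three residual sublayers: self-attention over tgt, cross-attention of the result against memory, and a position-wise feedforward network, each followed by layer normalization. As a descriptive sketch only (the names below are illustrative, not attributes of this class):

  x1  = LayerNorm(tgt + SelfAttention(tgt))
  x2  = LayerNorm(x1 + CrossAttention(x1, memory))
  out = LayerNorm(x2 + FeedForward(x2))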

__init__

TransformerDecoderLayer.__init__(
  d_model,
  nhead,
  dim_feedforward=2048,
  dropout=0.1,
  activation='relu'
)

Create a TransformerDecoderLayer.

Parameters:
  • d_model (int) – The dimension of features.
  • nhead (int) – The number of parallel heads.
  • dim_feedforward (int, optional, default=2048) – The dimension of the feedforward network.
  • dropout (float, optional, default=0.1) – The dropout ratio.
  • activation (str, optional, default='relu') – The activation function.
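
As a further sketch, the defaults above can be overridden at construction. The 'gelu' activation string below is an assumption carried over from the PyTorch API this module mirrors; only 'relu' is confirmed by this page:

from dragon.vm import torch

# Widen the feedforward network, raise the dropout ratio,
# and swap the activation function.
decoder_layer = torch.nn.TransformerDecoderLayer(
    d_model=8,
    nhead=2,
    dim_feedforward=32,
    dropout=0.2,
    activation='gelu',  # Assumed to be accepted alongside 'relu'.
)
tgt = torch.ones(5, 2, 8)
memory = torch.ones(4, 2, 8)
out = decoder_layer(tgt, memory)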