mmlearn.modules.layers.transformer_block module

Transformer Block and Embedding Modules for Vision Transformers (ViT).

class Block(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)

Bases: Module

Transformer Block.

This module represents a Transformer block that includes self-attention, normalization layers, and a feedforward multi-layer perceptron (MLP) network.

Parameters:

dim (int) – The input and output dimension of the block.
num_heads (int) – Number of attention heads.
mlp_ratio (float, optional, default=4.0) – Ratio of hidden dimension to the input dimension in the MLP.
qkv_bias (bool, optional, default=False) – If True, add a learnable bias to the query, key, value projections.
qk_scale (Optional[float], optional, default=None) – Override default qk scale of head_dim ** -0.5 if set.
drop (float, optional, default=0.0) – Dropout probability for the output of attention and MLP layers.
attn_drop (float, optional, default=0.0) – Dropout probability for the attention scores.
drop_path (float, optional, default=0.0) – Stochastic depth rate, a form of layer dropout.
act_layer (Callable[..., torch.nn.Module], optional, default=nn.GELU) – Activation layer in the MLP.
norm_layer (Callable[..., torch.nn.Module], optional, default=torch.nn.LayerNorm) – Normalization layer.
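As a rough illustration of how mlp_ratio and the default qk scale relate to dim and num_heads, the arithmetic can be sketched as below. The values are example inputs only, and the even split of dim across heads (head_dim = dim // num_heads) is an assumption based on common ViT implementations, not something stated in this reference.

```python
# Example values only; any dim divisible by num_heads would do.
dim, num_heads, mlp_ratio = 768, 12, 4.0

# Assumption: each head receives an equal slice of the embedding dimension.
head_dim = dim // num_heads

# mlp_ratio is the ratio of the MLP hidden dimension to the input dimension.
mlp_hidden_dim = int(dim * mlp_ratio)

# Default scale applied to the query-key product when qk_scale is None.
default_qk_scale = head_dim ** -0.5

print(head_dim, mlp_hidden_dim, default_qk_scale)
```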

forward(x, return_attention=False)

Forward pass through the Transformer Block.

Return type: Tensor
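The description above (self-attention, normalization layers, and an MLP) matches the usual pre-norm residual layout of ViT blocks. The following is a minimal sketch of that layout, not the mmlearn source: SketchBlock is a hypothetical name, torch.nn.MultiheadAttention stands in for the library's own attention module, stochastic depth is omitted, and returning the attention weights when return_attention=True is an assumption about what that flag does.

```python
import torch
from torch import nn


class SketchBlock(nn.Module):
    """Hypothetical pre-norm Transformer block (illustration only)."""

    def __init__(self, dim, num_heads, mlp_ratio=4.0, drop=0.0, attn_drop=0.0):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Stand-in attention layer; the real Block may use its own module.
        self.attn = nn.MultiheadAttention(
            dim, num_heads, dropout=attn_drop, batch_first=True
        )
        self.norm2 = nn.LayerNorm(dim)
        hidden_dim = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden_dim),
            nn.GELU(),
            nn.Dropout(drop),
            nn.Linear(hidden_dim, dim),
            nn.Dropout(drop),
        )

    def forward(self, x, return_attention=False):
        # Normalize, then attend; weights are averaged over heads by default.
        y = self.norm1(x)
        attn_out, attn_weights = self.attn(y, y, y, need_weights=True)
        if return_attention:
            # Assumed behavior: expose the attention map instead of the output.
            return attn_weights
        x = x + attn_out                  # residual around attention
        x = x + self.mlp(self.norm2(x))   # residual around the MLP
        return x


block = SketchBlock(dim=64, num_heads=4).eval()
tokens = torch.randn(2, 5, 64)            # (batch, sequence, dim)
out = block(tokens)
attn = block(tokens, return_attention=True)
print(out.shape, attn.shape)
```

The output keeps the input shape, which is why dim is documented as both the input and output dimension of the block.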

class DropPath(drop_prob=0.0)

Bases: Module

Drop paths (Stochastic Depth) per sample.

Parameters: drop_prob (float, optional, default=0.0) – Probability of dropping paths.

forward(x)

Forward pass through DropPath module.

Return type: Tensor

drop_path(x, drop_prob=0.0, training=False)

Drop paths (Stochastic Depth) for regularization during training.

Parameters:

x (torch.Tensor) – Input tensor.
drop_prob (float, optional, default=0.0) – Probability of dropping paths.
training (bool, optional, default=False) – Whether the model is in training mode.

Returns:

output – Output tensor after applying drop path.

Return type:

torch.Tensor
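For readers unfamiliar with stochastic depth, here is a minimal sketch of a per-sample drop-path function with the same signature. It is an illustration under stated assumptions rather than the mmlearn source: drop_path_sketch is a hypothetical name, and rescaling the surviving samples by 1 / (1 - drop_prob) follows common implementations and is not confirmed by this reference.

```python
import torch


def drop_path_sketch(x, drop_prob=0.0, training=False):
    """Hypothetical per-sample stochastic depth (illustration only)."""
    # Outside training, or with nothing to drop, the input passes through.
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    # One Bernoulli draw per sample, broadcast over all remaining dimensions.
    mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = x.new_empty(mask_shape).bernoulli_(keep_prob)
    # Assumption: rescale kept samples so the expected value is unchanged.
    return x / keep_prob * mask


torch.manual_seed(0)
x = torch.ones(8, 3, 4)
dropped = drop_path_sketch(x, drop_prob=0.5, training=True)
# Each sample is either zeroed entirely or scaled by 1 / keep_prob.
print(dropped[:, 0, 0])
```

Because the mask has one entry per sample, a whole residual branch is dropped for that sample, which differs from ordinary dropout, where individual elements are zeroed.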