fl4health.model_bases.masked_layers.masked_normalization_layers module¶
- class MaskedBatchNorm1d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)[source]¶
Bases: _MaskedBatchNorm
Applies (masked) Batch Normalization over a 2D or 3D input. Input shape should be (N, C) or (N, C, L), where N is the batch size, C is the number of features/channels, and L is the sequence length.
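A brief usage sketch for illustration (the import path follows the module name at the top of this page; shapes follow the description above):

```python
import torch
from fl4health.model_bases.masked_layers.masked_normalization_layers import MaskedBatchNorm1d

# 16 features/channels; remaining arguments keep the defaults shown in the signature above.
layer = MaskedBatchNorm1d(num_features=16)

x_2d = torch.randn(32, 16)       # (N, C)
x_3d = torch.randn(32, 16, 100)  # (N, C, L)

out_2d = layer(x_2d)             # output has the same shape as the input: (32, 16)
out_3d = layer(x_3d)             # output has the same shape as the input: (32, 16, 100)
```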
- class MaskedBatchNorm2d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)[source]¶
Bases: _MaskedBatchNorm
Applies (masked) Batch Normalization over a 4D input (a mini-batch of 2D inputs with additional channel dimension).
- class MaskedBatchNorm3d(num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True, device=None, dtype=None)[source]¶
Bases: _MaskedBatchNorm
Applies (masked) Batch Normalization over a 5D input (a mini-batch of 3D inputs with additional channel dimension).
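For the 4D and 5D cases, a shape sketch assuming the standard nn.BatchNorm2d / nn.BatchNorm3d input conventions of (N, C, H, W) and (N, C, D, H, W):

```python
import torch
from fl4health.model_bases.masked_layers.masked_normalization_layers import (
    MaskedBatchNorm2d,
    MaskedBatchNorm3d,
)

bn2d = MaskedBatchNorm2d(num_features=3)
bn3d = MaskedBatchNorm3d(num_features=3)

out_2d = bn2d(torch.randn(8, 3, 28, 28))      # (N, C, H, W) -> (8, 3, 28, 28)
out_3d = bn3d(torch.randn(8, 3, 16, 28, 28))  # (N, C, D, H, W) -> (8, 3, 16, 28, 28)
```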
- class MaskedLayerNorm(normalized_shape, eps=1e-05, elementwise_affine=True, bias=True, device=None, dtype=None)[source]¶
Bases: LayerNorm
- __init__(normalized_shape, eps=1e-05, elementwise_affine=True, bias=True, device=None, dtype=None)[source]¶
Implementation of the masked Layer Normalization module. When elementwise_affine is True, nn.LayerNorm has a learnable weight and (optional) bias. For MaskedLayerNorm, the weight and bias do not receive gradients during backpropagation. Instead, two score tensors - one for the weight and another for the bias - are maintained. In the forward pass, the score tensors are transformed by the Sigmoid function into probability scores, which are then used to produce binary masks via Bernoulli sampling. Finally, the binary masks are applied to the weight and the bias. During training, gradients with respect to the score tensors are computed and used to update the score tensors.
When elementwise_affine is False, nn.LayerNorm does not have weight or bias. Under this condition, both score tensors are None and MaskedLayerNorm acts in the same way as nn.LayerNorm.
Note: the scores are not assumed to be bounded between 0 and 1.
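A minimal sketch of the forward-pass masking described above (assumed, not the library's exact implementation; in particular, the estimator used to backpropagate through the Bernoulli sample is not shown):

```python
import torch

# A learnable score tensor is mapped to probabilities with a sigmoid, a binary
# mask is drawn via Bernoulli sampling, and the mask is applied elementwise to
# the frozen weight. The real module presumably uses a straight-through-style
# estimator so that gradients reach the scores, which this sketch omits.
weight = torch.ones(64)                             # frozen nn.LayerNorm-style weight
weight_score = torch.randn(64, requires_grad=True)  # learnable score tensor

keep_prob = torch.sigmoid(weight_score)  # scores -> probabilities in (0, 1)
mask = torch.bernoulli(keep_prob)        # per-element binary mask
masked_weight = weight * mask            # weight actually used in normalization
```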
- Parameters:
  - normalized_shape (TorchShape) – input shape from an expected input. If a single integer is used, it is treated as a singleton list, and this module will normalize over the last dimension, which is expected to be of that specific size.
  - eps (float) – a value added to the denominator for numerical stability. Default: 1e-5.
  - elementwise_affine (bool) – a boolean value that, when set to True, gives this module learnable per-element affine parameters initialized to ones (for weights) and zeros (for biases). Default: True.
  - bias (bool) – if set to False, the layer will not learn an additive bias (only relevant if elementwise_affine is True). Default: True.
- weight¶
The weights of the module. The values are initialized to 1.
- bias¶
The bias of the module. The values are initialized to 0.
- weight_score¶
learnable scores for the weights. Has the same shape as weight. When applied to the default initial values of self.weight (i.e., all ones), this is equivalent to randomly dropping out certain features.
- bias_score¶
learnable scores for the bias. Has the same shape as bias. When applied to the default initial values of self.bias (i.e., all zeros), it does not have any actual effect. Thus, bias_score only influences training when MaskedLayerNorm is created from some pretrained nn.LayerNorm module whose bias is not all zeros.
- forward(input)[source]¶
Define the computation performed at every call.
Should be overridden by all subclasses.
Return type: Tensor
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
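A short sketch of the preferred call pattern (calling the module instance rather than forward directly):

```python
import torch
from fl4health.model_bases.masked_layers.masked_normalization_layers import MaskedLayerNorm

layer = MaskedLayerNorm(normalized_shape=64)
x = torch.randn(32, 64)

out = layer(x)              # preferred: runs any registered hooks
# out = layer.forward(x)    # discouraged: silently skips registered hooks
```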