mmlearn.datasets.processors.masking.RandomMaskGenerator

class RandomMaskGenerator(probability)

Bases: object

Random mask generator.

Generates a random mask over the tokens of an encoded input, in which each token is selected for masking with the given probability. This is intended for tasks such as masked language modeling.

Parameters:

probability (float) – Probability of masking a token.
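To illustrate what a per-token masking probability means, here is a minimal, library-free sketch. It is not mmlearn's implementation: the function name random_token_mask and the seed argument are hypothetical, and plain Python lists stand in for tensors. It assumes each token position is drawn independently.

```python
import random


def random_token_mask(num_tokens, probability, seed=None):
    """Return a list of booleans; True marks a token selected for masking.

    Hypothetical helper for illustration only (not part of mmlearn).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    rng = random.Random(seed)
    # One independent Bernoulli draw per token position.
    return [rng.random() < probability for _ in range(num_tokens)]


mask = random_token_mask(num_tokens=10, probability=0.15, seed=0)
print(mask)
```

With probability set to 0.0 no token is selected, and with 1.0 every token is selected; values in between select roughly that fraction of tokens on average.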

Methods

__call__(inputs, tokenizer, special_tokens_mask=None)

Generate a random mask.

Returns a random mask over the encoded inputs, in which each token is selected for masking with the configured probability.

Parameters:

  • inputs (torch.Tensor) – The encoded inputs.

  • tokenizer (PreTrainedTokenizer) – The tokenizer.

  • special_tokens_mask (Optional[torch.Tensor], default=None) – Mask for special tokens.

Return type:

tuple[Tensor, Tensor, Tensor]
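The role of a special-tokens mask can be sketched as follows. This is an assumption based on common masked language modeling practice, where positions flagged as special (for example, start and end markers) are never selected for masking; the documentation above does not confirm that RandomMaskGenerator behaves this way. The function name mask_non_special and the seed argument are hypothetical, and plain lists stand in for tensors.

```python
import random


def mask_non_special(special_tokens_mask, probability, seed=None):
    """Select non-special positions for masking with the given probability.

    Hypothetical helper for illustration only (not part of mmlearn).
    special_tokens_mask holds 1 (or True) at special-token positions.
    """
    rng = random.Random(seed)
    return [
        # Special positions are never selected; others get a Bernoulli draw.
        (not is_special) and rng.random() < probability
        for is_special in special_tokens_mask
    ]


# Assumed layout: first and last positions are special tokens.
special = [1, 0, 0, 0, 1]
selected = mask_non_special(special, probability=0.5, seed=0)
print(selected)
```

Under this assumption, the first and last positions in the example are always False regardless of the probability or the seed.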