mmlearn.datasets.processors.masking.BlockwiseImagePatchMaskGenerator

class BlockwiseImagePatchMaskGenerator(input_size, num_masking_patches, min_num_patches=4, max_num_patches=None, min_aspect_ratio=0.3, max_aspect_ratio=None)[source]

Bases: object

Blockwise image patch mask generator.

This is primarily intended for the data2vec method.

Parameters:
  • input_size (Union[int, tuple[int, int]]) – The size of the input image. If an integer is provided, the image is assumed to be square.

  • num_masking_patches (int) – The number of patches to mask.

  • min_num_patches (int, default=4) – The minimum number of patches in each masked block.

  • max_num_patches (int, default=None) – The maximum number of patches in each masked block. If None, this defaults to num_masking_patches.

  • min_aspect_ratio (float, default=0.3) – The minimum aspect ratio of each masked block.

  • max_aspect_ratio (float, default=None) – The maximum aspect ratio of each masked block. If None, this defaults to 1 / min_aspect_ratio.

Methods

__call__()[source]

Generate a random mask.

Returns a random mask of shape (nb_patches, nb_patches), generated according to the configured parameters, in which num_masking_patches patches are masked.

Returns:

mask – A mask of shape (nb_patches, nb_patches).

Return type:

torch.Tensor
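The blockwise sampling strategy can be sketched as a simplified, self-contained re-implementation. The function name blockwise_mask and the pure-Python list-of-lists mask are illustrative only, not part of mmlearn's API: blocks with a random area (bounded by min_num_patches and max_num_patches) and a log-uniform aspect ratio are placed repeatedly until num_masking_patches patches are masked.

```python
import math
import random

def blockwise_mask(height, width, num_masking_patches,
                   min_num_patches=4, max_num_patches=None,
                   min_aspect_ratio=0.3, max_aspect_ratio=None):
    """Simplified sketch of blockwise patch masking on a (height, width) patch grid."""
    if max_num_patches is None:
        max_num_patches = num_masking_patches
    if max_aspect_ratio is None:
        max_aspect_ratio = 1 / min_aspect_ratio
    log_aspect = (math.log(min_aspect_ratio), math.log(max_aspect_ratio))

    mask = [[0] * width for _ in range(height)]
    masked = 0
    attempts = 0
    while masked < num_masking_patches and attempts < 10_000:
        attempts += 1
        # Sample a target block area, capped so we do not badly overshoot the budget.
        target = random.uniform(
            min_num_patches,
            min(max_num_patches, num_masking_patches - masked + min_num_patches),
        )
        # Sample an aspect ratio log-uniformly and derive the block's height/width.
        aspect = math.exp(random.uniform(*log_aspect))
        h = int(round(math.sqrt(target * aspect)))
        w = int(round(math.sqrt(target / aspect)))
        if h < 1 or w < 1 or h >= height or w >= width:
            continue
        # Place the block at a random position on the grid.
        top = random.randint(0, height - h)
        left = random.randint(0, width - w)
        # Mask the block, counting only newly masked patches toward the budget.
        for i in range(top, top + h):
            for j in range(left, left + w):
                if mask[i][j] == 0 and masked < num_masking_patches:
                    mask[i][j] = 1
                    masked += 1
    return mask
```

Because overlapping blocks only count newly masked patches, the loop converges to exactly num_masking_patches masked positions on any reasonably sized grid; mmlearn's generator additionally returns the mask as a torch.Tensor rather than nested lists.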

get_shape()[source]

Get the shape of the input.

Returns:

The shape of the input as a tuple (height, width).

Return type:

tuple[int, int]