mmlearn.datasets.core.example.Example

class Example(init_dict=None)

Bases: OrderedDict[Any, Any]

A representation of a single example from a dataset.

This class is a subclass of OrderedDict that also provides attribute-style access, so example["text"] and example.text are equivalent. All datasets in this library return examples as Example objects.

Parameters:

init_dict (Optional[MutableMapping[Hashable, Any]], optional, default=None) – Dictionary used to initialize the Example.

Examples

>>> example = Example({"text": torch.tensor(2)})
>>> example.text.zero_()
tensor(0)
>>> example.context = torch.tensor(4)  # set custom attributes after initialization
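The attribute-style access described above can be illustrated without torch. The following is a minimal sketch, not the library's implementation: AttrDict is a hypothetical stand-in for Example, and it assumes that attribute reads and writes are simply forwarded to the underlying dictionary.

```python
from collections import OrderedDict
from typing import Any


class AttrDict(OrderedDict):
    """Hypothetical stand-in for Example: an OrderedDict with attribute access."""

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails, so dict methods
        # such as keys() and items() keep working as usual.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name: str, value: Any) -> None:
        # Leave private/internal names alone; store everything else as an item.
        if name.startswith("_"):
            super().__setattr__(name, value)
        else:
            self[name] = value


example = AttrDict({"text": "hello"})
example.context = "world"  # stored as example["context"]

print(example.text == example["text"])
print(list(example.keys()))
```

Because new attributes become ordinary dictionary items, they appear in keys() in insertion order, like any other entry.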

Methods

create_ids()

Create a unique id for the example from the dataset and example index.

This method combines the dataset index and example index to create an attribute called example_ids, which is a dictionary of tensors. The dictionary keys are all the keys in the example except for example_ids, example_index, and dataset_index. Each value is a tensor of shape (2,) containing the pair (dataset_index, example_index). The example_ids attribute is used to (re-)identify pairs of examples from different modalities after they have been combined into a batch.

Warns:

UserWarning – If the example_index and dataset_index attributes are not set.

Return type:

None

Notes

  • The Example must have the following attributes set before calling this method: example_index (usually set/returned by the dataset) and dataset_index (usually set by the CombinedDataset object).

  • The find_matching_indices() function can be used to find matching examples given two tensors of example ids.
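The behaviour documented for create_ids() can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: sketch_create_ids is a hypothetical helper, plain tuples stand in for the tensors of shape (2,), and the sketch assumes that example_ids is left unset when either index is missing.

```python
import warnings
from typing import Any, Dict

# Keys that are excluded from example_ids, as listed in the description above.
RESERVED_KEYS = ("example_ids", "example_index", "dataset_index")


def sketch_create_ids(example: Dict[str, Any]) -> None:
    """Hypothetical re-creation of the documented create_ids() behaviour."""
    if "example_index" not in example or "dataset_index" not in example:
        # Assumption: warn and leave the example unchanged.
        warnings.warn(
            "example_index and dataset_index must be set to create example_ids",
            UserWarning,
            stacklevel=2,
        )
        return
    # The library stores a tensor of shape (2,); a tuple stands in for it here.
    pair = (example["dataset_index"], example["example_index"])
    example["example_ids"] = {
        key: pair for key in example if key not in RESERVED_KEYS
    }


example = {"text": "a caption", "rgb": "an image", "example_index": 7, "dataset_index": 1}
sketch_create_ids(example)
print(example["example_ids"])

# Without the two indices, only a UserWarning is emitted.
incomplete = {"text": "a caption"}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    sketch_create_ids(incomplete)
```

Every modality key receives the same (dataset_index, example_index) pair, which is what allows entries from different modalities to be matched again after batching.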