fl4health.utils.dataset_converter module¶
- class AutoEncoderDatasetConverter(condition=None, do_one_hot_encoding=False, custom_converter_function=None, condition_vector_size=None)[source]¶
Bases: DatasetConverter
- __init__(condition=None, do_one_hot_encoding=False, custom_converter_function=None, condition_vector_size=None)[source]¶
A dataset converter for formatting supervised data, such as MNIST, for self-supervised training in autoencoder-based models, and for handling an optional additional input (i.e. a condition). This class includes three converter functions that are chosen based on the condition; other converter functions can be added or passed in to support further conditions. A usage sketch follows the parameter list below.
- Parameters:
condition (str | torch.Tensor | None) – Can be a fixed tensor used for all data samples, None for non-conditional models, or a name (str) passed for other custom conversions, such as ‘label’.
do_one_hot_encoding (bool, optional) – Whether the converter should perform one-hot encoding on the condition.
custom_converter_function (Callable | None, optional) – An optional user-defined converter function.
condition_vector_size (int | None, optional) – Size of the condition vector.
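As referenced above, a minimal usage sketch is shown below. The dataset contents, shapes, and the choice of do_one_hot_encoding are illustrative assumptions, not values prescribed by the library.

```python
import torch
from torch.utils.data import TensorDataset

from fl4health.utils.dataset_converter import AutoEncoderDatasetConverter

# Hypothetical supervised dataset: 32 flattened 28x28 images with integer labels.
images = torch.rand(32, 784)
labels = torch.randint(0, 10, (32,))
dataset = TensorDataset(images, labels)

# Non-conditional autoencoder setup: no condition is attached to the inputs.
plain_converter = AutoEncoderDatasetConverter(condition=None)

# Label-conditioned setup: the string 'label' selects the label-based converter,
# and the condition is one-hot encoded (assumed configuration for this sketch).
conditional_converter = AutoEncoderDatasetConverter(
    condition="label",
    do_one_hot_encoding=True,
)
```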
- convert_dataset(dataset)[source]¶
Applies the converter function over the dataset when the dataset is used (i.e. during the dataloader creation step).
- Return type:
TensorDataset
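A minimal sketch of calling convert_dataset when building a dataloader is shown below; the dataset shapes and batch size are illustrative assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from fl4health.utils.dataset_converter import AutoEncoderDatasetConverter

dataset = TensorDataset(torch.rand(32, 784), torch.randint(0, 10, (32,)))
converter = AutoEncoderDatasetConverter(condition=None)

# Conversion is applied at dataloader-creation time, so the training loop only
# ever sees the converted (packed) tensors.
converted_dataset = converter.convert_dataset(dataset)
train_loader = DataLoader(converted_dataset, batch_size=8, shuffle=True)
```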
- static unpack_input_condition(packed_data, cond_vec_size, data_shape)[source]¶
Unpacks the model inputs (data and condition) from a tensor used in the training loop, regardless of the converter function used to pack them. Unpacking relies on the size of the condition vector and the original data shape, which is saved before the packing process.
- Parameters:
packed_data (torch.Tensor) – Data tensor used in the training loop as the input to the model.
cond_vec_size (int) – Size of the condition vector packed into packed_data.
data_shape (torch.Size) – Original shape of the data, saved before the packing process.
- Returns:
Data in its original shape, and the condition vector to be fed into the model.
- Return type:
tuple[torch.Tensor, torch.Tensor]
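Shown below is a minimal sketch of unpacking a packed batch. The condition size, data shape, and the concatenated packing layout are assumptions made purely for illustration.

```python
import torch

from fl4health.utils.dataset_converter import AutoEncoderDatasetConverter

# Illustrative sizes: a 10-dimensional condition (e.g. a one-hot label) packed
# together with flattened 784-dimensional inputs.
cond_vec_size = 10
data_shape = torch.Size([784])

# A fake packed batch of 8 samples; the exact packing layout produced by the
# converter is assumed here for demonstration only.
packed_batch = torch.rand(8, 784 + cond_vec_size)

data, condition = AutoEncoderDatasetConverter.unpack_input_condition(
    packed_batch, cond_vec_size, data_shape
)
# `data` is restored to its original shape and `condition` can be fed into the model.
print(data.shape, condition.shape)
```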
- class DatasetConverter(converter_function, dataset)[source]¶
Bases: TensorDataset
- __init__(converter_function, dataset)[source]¶
Dataset converter classes are designed to re-format a dataset for a given training task so that it fits into the unified supervised-learning training scheme used in clients. Converters can be used in the data loading step and can also apply a light pre-processing step to datasets before training. A usage sketch follows below.
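As referenced above, a minimal sketch of a custom DatasetConverter is shown below. It assumes the converter function maps a (data, target) pair to a new (data, target) pair and that the base class exposes the same convert_dataset method shown above; the normalization function and dataset contents are hypothetical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from fl4health.utils.dataset_converter import DatasetConverter


def normalize_pair(data: torch.Tensor, target: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    # Light pre-processing: scale raw pixel values into [0, 1], keep targets as-is.
    return data / 255.0, target


# Hypothetical raw dataset of 32 flattened images with integer labels.
dataset = TensorDataset(
    torch.randint(0, 256, (32, 784)).float(),
    torch.randint(0, 10, (32,)),
)

converter = DatasetConverter(normalize_pair, dataset)

# Apply the converter at data loading time and wrap the result in a dataloader.
converted_dataset = converter.convert_dataset(dataset)
train_loader = DataLoader(converted_dataset, batch_size=8, shuffle=True)
```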