fl4health.utils.dataset_converter module¶
- class AutoEncoderDatasetConverter(condition=None, do_one_hot_encoding=False, custom_converter_function=None, condition_vector_size=None)[source]¶
Bases: DatasetConverter
- __init__(condition=None, do_one_hot_encoding=False, custom_converter_function=None, condition_vector_size=None)[source]¶
A dataset converter for formatting supervised data, such as MNIST, for self-supervised training in autoencoder-based models, and for handling an optional additional input (i.e. a condition). This class includes three converter functions that are chosen based on the condition; other converter functions can be added or passed in to support further conditions. A usage sketch follows the parameter list below.
- Parameters:
condition (str | torch.Tensor | None) – Can be a fixed tensor used for all data samples, None for non-conditional models, or a name (str) passed for other custom conversions, such as ‘label’.
do_one_hot_encoding (bool, optional) – Whether the converter should perform one-hot encoding on the condition.
custom_converter_function (Callable | None, optional) – An optional user-defined converter function.
condition_vector_size (int | None, optional) – Size of the condition vector.
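As referenced above, a minimal usage sketch is shown below. The dataset contents, shapes, and the choice of do_one_hot_encoding are illustrative assumptions, not values prescribed by the library.

```python
import torch
from torch.utils.data import TensorDataset

from fl4health.utils.dataset_converter import AutoEncoderDatasetConverter

# Hypothetical supervised dataset: 32 flattened 28x28 images with integer labels.
images = torch.rand(32, 784)
labels = torch.randint(0, 10, (32,))
dataset = TensorDataset(images, labels)

# Non-conditional autoencoder setup: no condition is attached to the inputs.
plain_converter = AutoEncoderDatasetConverter(condition=None)

# Label-conditioned setup: the string 'label' selects the label-based converter,
# and the condition is one-hot encoded (assumed configuration for this sketch).
conditional_converter = AutoEncoderDatasetConverter(
    condition="label",
    do_one_hot_encoding=True,
)
```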
- convert_dataset(dataset)[source]¶
Applies the converter function over the dataset when the dataset is used (i.e. during the dataloader creation step).
- Return type:
TensorDataset
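A minimal sketch of calling convert_dataset when building a dataloader is shown below; the dataset shapes and batch size are illustrative assumptions.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from fl4health.utils.dataset_converter import AutoEncoderDatasetConverter

dataset = TensorDataset(torch.rand(32, 784), torch.randint(0, 10, (32,)))
converter = AutoEncoderDatasetConverter(condition=None)

# Conversion is applied at dataloader-creation time, so the training loop only
# ever sees the converted (packed) tensors.
converted_dataset = converter.convert_dataset(dataset)
train_loader = DataLoader(converted_dataset, batch_size=8, shuffle=True)
```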
- static unpack_input_condition(packed_data, cond_vec_size, data_shape)[source]¶
Unpacks the model inputs (data and condition) from a tensor used in the training loop, regardless of the converter function used to pack them. Unpacking relies on the size of the condition vector and the original data shape, which is saved before the packing process.
- Parameters:
packed_data (torch.Tensor) – Data tensor used in the training loop as the input to the model.
cond_vec_size (int) – Size of the condition vector packed into packed_data.
data_shape (torch.Size) – Original shape of the data, saved before the packing process.
- Returns:
Data in its original shape, and the condition vector to be fed into the model.
- Return type:
tuple[torch.Tensor, torch.Tensor]
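Shown below is a minimal sketch of unpacking a packed batch. The condition size, data shape, and the concatenated packing layout are assumptions made purely for illustration.

```python
import torch

from fl4health.utils.dataset_converter import AutoEncoderDatasetConverter

# Illustrative sizes: a 10-dimensional condition (e.g. a one-hot label) packed
# together with flattened 784-dimensional inputs.
cond_vec_size = 10
data_shape = torch.Size([784])

# A fake packed batch of 8 samples; the exact packing layout produced by the
# converter is assumed here for demonstration only.
packed_batch = torch.rand(8, 784 + cond_vec_size)

data, condition = AutoEncoderDatasetConverter.unpack_input_condition(
    packed_batch, cond_vec_size, data_shape
)
# `data` is restored to its original shape and `condition` can be fed into the model.
print(data.shape, condition.shape)
```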
- class DatasetConverter(converter_function, dataset)[source]¶
Bases: TensorDataset
- __init__(converter_function, dataset)[source]¶
Dataset converter classes are designed to re-format a dataset for a given training task so that it fits into the unified supervised-learning training scheme used in clients. Converters can be used in the data loading step and can also apply a light pre-processing step to datasets before training. A usage sketch follows below.
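As referenced above, a minimal sketch of a custom DatasetConverter is shown below. It assumes the converter function maps a (data, target) pair to a new (data, target) pair and that the base class exposes the same convert_dataset method shown above; the normalization function and dataset contents are hypothetical.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from fl4health.utils.dataset_converter import DatasetConverter


def normalize_pair(data: torch.Tensor, target: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    # Light pre-processing: scale raw pixel values into [0, 1], keep targets as-is.
    return data / 255.0, target


# Hypothetical raw dataset of 32 flattened images with integer labels.
dataset = TensorDataset(
    torch.randint(0, 256, (32, 784)).float(),
    torch.randint(0, 10, (32,)),
)

converter = DatasetConverter(normalize_pair, dataset)

# Apply the converter at data loading time and wrap the result in a dataloader.
converted_dataset = converter.convert_dataset(dataset)
train_loader = DataLoader(converted_dataset, batch_size=8, shuffle=True)
```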