fl4health.utils.load_data module
- get_train_and_val_cifar10_datasets(data_dir, transform=None, target_transform=None, validation_proportion=0.2, hash_key=None)
- Return type:
- get_train_and_val_mnist_datasets(data_dir, transform=None, target_transform=None, validation_proportion=0.2, hash_key=None)
- Return type:
- load_cifar10_data(data_dir, batch_size, sampler=None, validation_proportion=0.2, hash_key=None)
Load CIFAR10 Dataset (training and validation set).
- Parameters:
data_dir (Path) – The path to the CIFAR10 dataset locally. Dataset is downloaded to this location if it does not already exist.
batch_size (int) – The batch size to use for the train and validation data loaders.
sampler (LabelBasedSampler | None) – Optional sampler to subsample dataset based on labels.
validation_proportion (float) – A float between 0 and 1 specifying the proportion of samples to allocate to the validation dataset. Defaults to 0.2.
hash_key (int | None) – Optional hash key to create a reproducible split for train and validation datasets.
- Returns:
- The train data loader, the validation data loader, and a dictionary with the sample counts of the datasets underpinning the respective data loaders.
- Return type:
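The interaction between validation_proportion and hash_key can be illustrated with a minimal sketch. It uses only the standard library and is not the fl4health implementation: the helper split_indices, the use of hash_key as a random seed, and the dictionary keys "train_set" and "validation_set" are all assumptions made for illustration.

```python
import random


def split_indices(n_samples, validation_proportion=0.2, hash_key=None):
    """Split range(n_samples) into (train, validation) index lists.

    Hypothetical helper: it only illustrates the idea of a seeded split.
    """
    # This sketch rejects the endpoints so that neither split is empty.
    if not 0.0 < validation_proportion < 1.0:
        raise ValueError("validation_proportion must be strictly between 0 and 1")
    indices = list(range(n_samples))
    # Assumption: hash_key acts as a seed. A dedicated generator keeps the
    # shuffle repeatable without touching the global random state.
    rng = random.Random(hash_key)
    rng.shuffle(indices)
    n_validation = int(n_samples * validation_proportion)
    return indices[n_validation:], indices[:n_validation]


train_idx, val_idx = split_indices(50_000, validation_proportion=0.2, hash_key=42)

# Hypothetical shape of the sample-count dictionary returned with the loaders.
sample_counts = {"train_set": len(train_idx), "validation_set": len(val_idx)}
```

Calling split_indices again with the same hash_key yields the same two index lists, which is the property a reproducible train/validation split needs.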
- load_cifar10_test_data(data_dir, batch_size, sampler=None)
Load CIFAR10 Test Dataset.
- Parameters:
data_dir (Path) – The path to the CIFAR10 dataset locally. Dataset is downloaded to this location if it does not already exist.
batch_size (int) – The batch size to use for the test dataloader.
sampler (LabelBasedSampler | None) – Optional sampler to subsample dataset based on labels.
- Returns:
- The test data loader and a dictionary containing the sample count of the test dataset.
- Return type:
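To make the idea of subsampling a dataset based on labels concrete, here is a rough standard-library sketch. It does not reproduce LabelBasedSampler or any of its subclasses: the function subsample_by_label, its keep_fraction_per_label argument, and the "test_set" key are hypothetical names chosen for this example.

```python
import random
from collections import defaultdict


def subsample_by_label(labels, keep_fraction_per_label, seed=0):
    """Return sorted indices that keep the requested fraction of each label.

    Labels missing from keep_fraction_per_label are kept in full.
    """
    # Group sample indices by their label.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)
    kept = []
    for label, indices in by_label.items():
        fraction = keep_fraction_per_label.get(label, 1.0)
        n_keep = round(len(indices) * fraction)
        # Draw n_keep distinct indices from this label's group.
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
# Keep half of label 0, drop label 1 entirely, keep all of label 2.
subset = subsample_by_label(labels, {0: 0.5, 1: 0.0})

# Hypothetical shape of the sample-count dictionary for a test loader.
sample_counts = {"test_set": len(subset)}
```

Skewing the per-label fractions in this way is one common approach to simulating heterogeneous label distributions across clients.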
- load_mnist_data(data_dir, batch_size, sampler=None, transform=None, target_transform=None, dataset_converter=None, validation_proportion=0.2, hash_key=None)
Load MNIST Dataset (training and validation set).
- Parameters:
data_dir (Path) – The path to the MNIST dataset locally. Dataset is downloaded to this location if it does not already exist.
batch_size (int) – The batch size to use for the train and validation data loaders.
sampler (LabelBasedSampler | None) – Optional sampler to subsample dataset based on labels.
transform (Callable | None) – Optional transform to be applied to input samples.
target_transform (Callable | None) – Optional transform to be applied to targets.
dataset_converter (DatasetConverter | None) – Optional dataset converter used to convert the input and/or target of the train and validation datasets.
validation_proportion (float) – A float between 0 and 1 specifying the proportion of samples to allocate to the validation dataset. Defaults to 0.2.
hash_key (int | None) – Optional hash key to create a reproducible split for train and validation datasets.
- Returns:
- The train data loader, the validation data loader, and a dictionary with the sample counts of the datasets underpinning the respective data loaders.
- Return type:
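The roles of transform and target_transform can be shown with a small stand-in dataset. This is a sketch under the usual convention that transform is applied to each input and target_transform to each target when a sample is fetched. The TinyDataset class and both example callables are hypothetical and are not part of fl4health.

```python
class TinyDataset:
    """Hypothetical in-memory dataset that applies optional per-sample transforms."""

    def __init__(self, inputs, targets, transform=None, target_transform=None):
        self.inputs = inputs
        self.targets = targets
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, index):
        x, y = self.inputs[index], self.targets[index]
        # Transforms run lazily, once per fetched sample.
        if self.transform is not None:
            x = self.transform(x)
        if self.target_transform is not None:
            y = self.target_transform(y)
        return x, y


def scale_pixels(pixels):
    # Map 8-bit intensities to the [0, 1] range.
    return [p / 255 for p in pixels]


def digit_parity(digit):
    # Relabel a digit as 0 (even) or 1 (odd).
    return digit % 2


dataset = TinyDataset(
    inputs=[[0, 255], [255, 0]],
    targets=[3, 8],
    transform=scale_pixels,
    target_transform=digit_parity,
)
first_input, first_target = dataset[0]
```

Leaving either argument as None returns the corresponding value unchanged, which matches the "optional" wording in the parameter descriptions.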
- load_mnist_test_data(data_dir, batch_size, sampler=None, transform=None)
Load MNIST Test Dataset.
- Parameters:
data_dir (Path) – The path to the MNIST dataset locally. Dataset is downloaded to this location if it does not already exist.
batch_size (int) – The batch size to use for the test dataloader.
sampler (LabelBasedSampler | None) – Optional sampler to subsample dataset based on labels.
transform (Callable | None) – Optional transform to be applied to input samples.
- Returns:
- The test data loader and a dictionary containing the sample count of the test dataset.
- Return type:
- load_msd_dataset(data_path, msd_dataset_name)
Downloads and extracts one of the 10 Medical Segmentation Decathlon (MSD) datasets.
- Parameters:
data_path (str) – Path to the folder in which to extract the dataset. The data itself will be in a subfolder named after the dataset, not in the data_path directory itself. The subfolder name is the dataset name as defined by the values of the MsdDataset enum returned by get_msd_dataset_enum.
msd_dataset_name (str) – The name of one of the 10 MSD datasets.
- Return type:
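The directory layout described for data_path can be sketched as follows. The enum below is a hypothetical stand-in for MsdDataset with only two members. Its member names and values are assumptions modelled on the public MSD task names and may not match the real enum, and expected_dataset_dir is an illustrative helper rather than a library function.

```python
from enum import Enum
from pathlib import Path


class MsdDatasetSketch(Enum):
    """Hypothetical stand-in for the MsdDataset enum (two of ten datasets)."""

    BRAIN_TUMOUR = "Task01_BrainTumour"
    HEART = "Task02_Heart"


def expected_dataset_dir(data_path, dataset):
    # The extracted data lives in a subfolder named after the enum value,
    # not directly in data_path.
    return Path(data_path) / dataset.value


dataset_dir = expected_dataset_dir("data/msd", MsdDatasetSketch.HEART)
```

Code that reads the extracted files would therefore look inside dataset_dir rather than in the data_path folder passed to the loader.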