fl4health.datasets.rxrx1.preprocess module
- filter_and_save_data(metadata, top_sirna_ids, cell_type, output_path)
Filters the metadata to rows that match the given cell type and whose sirna_id is among the most frequent ones (top_sirna_ids), then saves the result to a CSV file.
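As an illustration only, the filtering step described above might look roughly like the following pandas sketch. This is not the library's implementation: the function name `filter_and_save_sketch` is hypothetical, and the column names `cell_type` and `sirna_id` are assumptions about the metadata layout.

```python
from pathlib import Path

import pandas as pd


def filter_and_save_sketch(
    metadata: pd.DataFrame,
    top_sirna_ids: list[int],
    cell_type: str,
    output_path: Path,
) -> pd.DataFrame:
    """Hypothetical sketch: keep rows for one cell type and a set of sirna_ids."""
    # Assumed column names: "cell_type" and "sirna_id".
    mask = (metadata["cell_type"] == cell_type) & metadata["sirna_id"].isin(top_sirna_ids)
    filtered = metadata[mask]
    # Write the filtered rows to CSV without the DataFrame index.
    filtered.to_csv(output_path, index=False)
    return filtered
```

In this sketch the filtered frame is also returned so the caller can inspect it; whether the real function returns anything is not stated here.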
- process_data(metadata, input_dir, output_dir, client_num, type_data)
Process the entire dataset, loading image tensors for each row.
- Parameters:
metadata (pd.DataFrame) – Metadata containing information about all images.
input_dir (Path) – Input directory containing the image files.
output_dir (Path) – Output directory where the processed data is written.
client_num (int) – Client number to load data for.
type_data (str) – ‘train’ or ‘test’ to specify dataset type.
- Return type: