fl4health.clients.tabular_data_client module
- class TabularDataClient(data_path, metrics, device, id_column, targets, loss_meter_type=LossMeterType.AVERAGE, checkpoint_and_state_module=None, reporters=None, progress_bar=False, client_name=None)
Bases: BasicClient
- __init__(data_path, metrics, device, id_column, targets, loss_meter_type=LossMeterType.AVERAGE, checkpoint_and_state_module=None, reporters=None, progress_bar=False, client_name=None)
Client to facilitate federated feature space alignment, specifically for tabular data, and then perform federated training.
- Parameters:
data_path (Path) – Path to the data to be loaded for client-side training.
metrics (Sequence[Metric]) – Metrics to be computed based on the labels and predictions of the client model.
device (torch.device) – Device indicator for where to send the model, batches, labels etc. Often “cpu” or “cuda”.
id_column (str) – ID column. This is required for tabular encoding. It should be unique per row, but need not be a meaningful identifier (e.g. it could be the row number).
targets (str | list[str]) – The name of the target column or columns. This allows for multiple targets to be specified if desired.
loss_meter_type (LossMeterType, optional) – Type of meter used to track and compute the losses over each batch. Defaults to LossMeterType.AVERAGE.
checkpoint_and_state_module (ClientCheckpointAndStateModule | None, optional) – A module meant to handle both checkpointing and state saving. The module, and its underlying model and state checkpointing components, will determine when and how to do checkpointing during client-side training. No checkpointing (state or model) is done if not provided. Defaults to None.
reporters (Sequence[BaseReporter] | None, optional) – A sequence of FL4Health reporters which the client should send data to. Defaults to None.
progress_bar (bool, optional) – Whether or not to display a progress bar during client training and validation. Uses tqdm. Defaults to False.
client_name (str | None, optional) – An optional client name that uniquely identifies a client. If not passed, a hash is randomly generated. Client state will use this as part of its state file name. Defaults to None.
- get_data_frame(config)
User defined method that returns a pandas dataframe.
- Parameters:
config (Config) – Configuration sent by the server for customization of the function
- Return type:
DataFrame
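As a rough illustration of what the body of a get_data_frame override might do, the sketch below reads a CSV into a pandas DataFrame and makes sure an ID column exists and is unique per row. It does not import fl4health. The helper name load_tabular_frame, the column name row_id, and the in-memory CSV are assumptions made for this example only.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for the file found at data_path.
CSV_TEXT = """age,income,label
34,52000,1
51,61000,0
29,48000,1
"""


def load_tabular_frame(source, id_column: str) -> pd.DataFrame:
    """Read tabular data and guarantee a unique ID column (illustrative helper)."""
    df = pd.read_csv(source)
    if id_column not in df.columns:
        # The ID need not be meaningful; the row number is enough.
        df.insert(0, id_column, list(range(len(df))))
    if not df[id_column].is_unique:
        raise ValueError(f"Column '{id_column}' must be unique per row.")
    return df


frame = load_tabular_frame(io.StringIO(CSV_TEXT), id_column="row_id")
```

In a real subclass, the same logic would presumably live inside get_data_frame and read from the client's data_path rather than from a string buffer.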
- get_properties(config)
Return properties of client to be sent to the server. Depending on whether the server has communicated the information to be used for feature alignment, the client will send the input/output dimensions so the server can use them to initialize the global model.
First initializes the client because this is called prior to the first federated learning round.
- preset_specific_pipeline(feature_name, pipeline)
The user may use this method to specify a pipeline to be applied to a particular feature. This function stores the provided pipeline, associated with the provided feature_name.
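The storing behaviour described here can be pictured as a mapping from feature name to pipeline. The following is a minimal sketch of that idea only, not the fl4health implementation: the class FeaturePipelineStore and its attribute feature_specific_pipelines are hypothetical, and plain callables stand in for whatever pipeline objects the library expects.

```python
from typing import Any, Callable

PipelineLike = Callable[[Any], Any]  # stand-in type for a pipeline object


class FeaturePipelineStore:
    """Hypothetical stand-in that keeps one pipeline per feature name."""

    def __init__(self) -> None:
        self.feature_specific_pipelines: dict[str, PipelineLike] = {}

    def preset_specific_pipeline(self, feature_name: str, pipeline: PipelineLike) -> None:
        # Store (or overwrite) the pipeline registered for this feature.
        self.feature_specific_pipelines[feature_name] = pipeline


store = FeaturePipelineStore()
store.preset_specific_pipeline("age", lambda values: [v / 100 for v in values])
store.preset_specific_pipeline("city", lambda values: [v.strip().lower() for v in values])

# Look up the pipeline registered for "age" and apply it to some values.
scaled_ages = store.feature_specific_pipelines["age"]([34, 51])
```

Keying by feature name means a later call for the same feature replaces the earlier pipeline; whether the library behaves the same way is not stated in this reference.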
- set_feature_specific_pipelines()
Given the feature-specific pipelines, add them to the tabular feature preprocessor.
- Return type:
- setup_client(config)
Initialize the client by encoding the information of its tabular data and initializing the corresponding TabularFeaturesPreprocessor. config[SOURCE_SPECIFIED] indicates whether the server has obtained the source of information to perform feature alignment. If it is True, it means the server has obtained such information (either a priori or by polling a client). So the client will encode that information and use it instead to perform feature preprocessing.
- Parameters:
config (Config) – Configuration sent by the server for customization of the function
- Return type: