fl4health.utils.early_stopper module

class EarlyStopper(client, patience=1, interval_steps=5, snapshot_dir=None)[source]

Bases: object

__init__(client, patience=1, interval_steps=5, snapshot_dir=None)[source]

The early stopper is a client plugin that stops local training based on the validation loss. During training it keeps track of the client's best state and restores that state if the client is stopped. If the client starts to overfit, the early stopper halts training and restores the best state before the model is sent to the server.

Parameters:
  • client (BasicClient) – The client to be monitored.

  • patience (int, optional) – Number of validation cycles to wait before stopping training. If None, the client never stops early but still loads the best state before sending the model to the server. Defaults to 1.

  • interval_steps (int) – How often, in training steps, the early stopping mechanism evaluates the validation loss. Defaults to 5.

  • snapshot_dir (Path | None, optional) – Directory to which the best state is checkpointed instead of being kept in memory. If not given, the best state is kept in memory. Defaults to None.
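
To illustrate how patience and interval_steps interact, here is a minimal, self-contained sketch of patience-based early stopping. ToyEarlyStopper is a hypothetical stand-in, not this module's EarlyStopper: it takes the validation loss and state as explicit arguments (an assumption made so the example runs without a client), and its exact countdown behavior is an assumption rather than a statement about the library.

```python
import copy
import math


class ToyEarlyStopper:
    """Hypothetical stand-in for illustration only; NOT fl4health's EarlyStopper."""

    def __init__(self, patience=1, interval_steps=5):
        self.patience = patience  # None means "never stop early"
        self.interval_steps = interval_steps
        self.best_loss = math.inf
        self.best_state = None
        self.count_down = patience

    def should_stop(self, steps, val_loss, state):
        # Only check the validation loss every interval_steps training steps.
        if steps % self.interval_steps != 0:
            return False
        # On improvement: remember the best state and reset the countdown.
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.best_state = copy.deepcopy(state)
            self.count_down = self.patience
            return False
        # With patience=None we keep tracking the best state but never stop.
        if self.patience is None:
            return False
        # No improvement: spend one validation cycle of patience.
        self.count_down -= 1
        return self.count_down <= 0


stopper = ToyEarlyStopper(patience=1, interval_steps=5)
for step, loss in [(5, 1.0), (10, 0.8), (15, 0.9)]:
    if stopper.should_stop(step, loss, {"step": step}):
        print(f"stopping at step {step}, best state: {stopper.best_state}")
        break
```

In this sketch, with patience=1 the first validation check that fails to improve on the best loss ends training, and the state recorded at the best check is the one that would be restored.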

add_default_snapshot_attr(name, snapshot_class, input_type)[source]

Return type:

None

delete_default_snapshot_attr(name)[source]

Return type:

None

load_snapshot(attributes=None)[source]

Loads the checkpointed snapshot dict into the respective model attributes.

Parameters:

attributes (list[str] | None) – List of attributes to load from the checkpoint. If None, all attributes are loaded. Defaults to None.

Return type:

None
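
The effect of the attributes argument can be sketched with a plain dictionary. The helper restore_attributes and the ToyClient class below are hypothetical names introduced for illustration; they assume a snapshot is a mapping from attribute name to saved value and do not reflect how the library stores or restores state internally.

```python
def restore_attributes(target, snapshot, attributes=None):
    """Hypothetical helper: copy selected snapshot entries onto target."""
    # None means "restore everything in the snapshot".
    names = list(snapshot) if attributes is None else attributes
    for name in names:
        setattr(target, name, snapshot[name])


class ToyClient:
    """Hypothetical client with two attributes worth snapshotting."""

    def __init__(self):
        self.total_steps = 0
        self.learning_rate = 0.1


client = ToyClient()
snapshot = {"total_steps": 40, "learning_rate": 0.01}

# Restore only one attribute; learning_rate keeps its current value.
restore_attributes(client, snapshot, attributes=["total_steps"])
```

Calling the helper again without attributes would restore every entry in the snapshot.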

save_snapshot()[source]

Creates a snapshot of the client state and, if snapshot_ckpt is given, saves it to the checkpoint.

Return type:

None
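
The choice between keeping the snapshot in memory and writing it to a checkpoint can be sketched as follows. The helpers save_state and load_state, the file name snapshot.pkl, and the use of pickle are all assumptions made for a runnable illustration; the library's own checkpointing mechanism may differ.

```python
import copy
import pickle
import tempfile
from pathlib import Path


def save_state(state, snapshot_dir=None):
    """Hypothetical helper: keep a copy in memory, or write it to snapshot_dir."""
    if snapshot_dir is None:
        # In-memory snapshot: deep copy so later training cannot mutate it.
        return copy.deepcopy(state)
    path = Path(snapshot_dir) / "snapshot.pkl"  # file name is an assumption
    with path.open("wb") as f:
        pickle.dump(state, f)
    return path


def load_state(handle):
    """Hypothetical helper: read back whatever save_state returned."""
    if isinstance(handle, Path):
        with handle.open("rb") as f:
            return pickle.load(f)
    return handle


state = {"weights": [0.1, 0.2], "total_steps": 10}
in_memory = save_state(state)

with tempfile.TemporaryDirectory() as tmp:
    on_disk = save_state(state, snapshot_dir=tmp)
    restored = load_state(on_disk)
```

Writing to disk trades some I/O time for lower memory use, which can matter when the snapshotted state is large.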

should_stop(steps)[source]

Determine if the client should stop training based on early stopping criteria.

Parameters:

steps (int) – Number of steps since the start of the training.

Returns:

True if training should stop, otherwise False.

Return type:

bool