fl4health.strategies.fedpm module

class FedPm(*, fraction_fit=1.0, fraction_evaluate=1.0, min_fit_clients=2, min_evaluate_clients=2, min_available_clients=2, evaluate_fn=None, on_fit_config_fn=None, on_evaluate_config_fn=None, accept_failures=True, initial_parameters=None, fit_metrics_aggregation_fn=None, evaluate_metrics_aggregation_fn=None, weighted_eval_losses=True, bayesian_aggregation=True)[source]

Bases: FedAvgDynamicLayer

__init__(*, fraction_fit=1.0, fraction_evaluate=1.0, min_fit_clients=2, min_evaluate_clients=2, min_available_clients=2, evaluate_fn=None, on_fit_config_fn=None, on_evaluate_config_fn=None, accept_failures=True, initial_parameters=None, fit_metrics_aggregation_fn=None, evaluate_metrics_aggregation_fn=None, weighted_eval_losses=True, bayesian_aggregation=True)[source]

A strategy used for aggregating probability masks in the “Federated Probabilistic Mask Training” paradigm, as detailed in http://arxiv.org/pdf/2209.15328. The implementation here supports simple averaging of the probability masks as well as the more sophisticated Bayesian aggregation approach.

Note: since the parameters aggregated by this strategy are supposed to be binary masks, FedPM performs unweighted (uniform) averaging by default. The effect of weighted averaging is also not covered in the original work.

Parameters:
  • fraction_fit (float, optional) – Fraction of clients used during training. Defaults to 1.0.

  • fraction_evaluate (float, optional) – Fraction of clients used during validation. Defaults to 1.0.

  • min_fit_clients (int, optional) – Minimum number of clients used during fitting. Defaults to 2.

  • min_evaluate_clients (int, optional) – Minimum number of clients used during validation. Defaults to 2.

  • min_available_clients (int, optional) – Minimum number of total clients in the system. Defaults to 2.

  • evaluate_fn (Callable[[int, NDArrays, dict[str, Scalar]], tuple[float, dict[str, Scalar]] | None] | None) – Optional function used for central server-side evaluation. Defaults to None.

  • on_fit_config_fn (Callable[[int], dict[str, Scalar]] | None, optional) – Function used to configure training by providing a configuration dictionary. Defaults to None.

  • on_evaluate_config_fn (Callable[[int], dict[str, Scalar]] | None, optional) – Function used to configure client-side validation by providing a Config dictionary. Defaults to None.

  • accept_failures (bool, optional) – Whether or not accept rounds containing failures. Defaults to True.

  • initial_parameters (Parameters | None, optional) – Initial global model parameters. Defaults to None.

  • fit_metrics_aggregation_fn (MetricsAggregationFn | None, optional) – Metrics aggregation function. Defaults to None.

  • evaluate_metrics_aggregation_fn (MetricsAggregationFn | None, optional) – Metrics aggregation function. Defaults to None.

  • weighted_aggregation (bool, optional) – Determines whether parameter aggregation is a linearly weighted average or a uniform average. FedAvg default is weighted average by client dataset counts. Defaults to True.

  • weighted_eval_losses (bool, optional) – Determines whether losses during evaluation are linearly weighted averages or a uniform average. FedAvg default is weighted average of the losses by client dataset counts. Defaults to True.

  • bayesian_aggregation (bool, optional) – Determines whether Bayesian aggregation is used. Defaults to True.

aggregate(results)[source]

Aggregate the different layers across clients that have contributed to a layer. This aggregation may be weighted or unweighted. The called functions handle layer alignment.

Parameters:

results (list[tuple[NDArrays, int]]) – The weight results from each client’s local training that need to be aggregated on the server-side, along with the number of training samples held on each client. In this scheme, the clients pack the layer names into the results object along with the weight values to allow for alignment during aggregation.

Returns:

A dictionary mapping the name of the layer that was aggregated to the aggregated weights.

Return type:

dict[str, NDArray]
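The per-layer alignment and unweighted averaging described above can be sketched in plain numpy. This is an illustration of the math only, not the strategy's actual packing format: here each client is assumed to report a dict mapping layer names to binary masks, and each layer is averaged only over the clients that contributed it.

```python
import numpy as np


def average_masks(client_masks: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    """Uniformly average binary masks per layer across contributing clients."""
    sums: dict[str, np.ndarray] = {}
    counts: dict[str, int] = {}
    for masks in client_masks:
        for name, mask in masks.items():
            sums[name] = sums.get(name, 0) + mask
            counts[name] = counts.get(name, 0) + 1
    # Divide each layer's summed mask by the number of clients that sent it.
    return {name: sums[name] / counts[name] for name in sums}


# Two clients; only the second contributes the "b" layer.
aggregated = average_masks(
    [
        {"w": np.array([1.0, 0.0])},
        {"w": np.array([1.0, 1.0]), "b": np.array([0.0])},
    ]
)
```

Note that because a layer may be absent from some clients, each layer's divisor is its own contribution count rather than the total number of clients.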

aggregate_bayesian(results)[source]

Perform posterior update to the Beta distribution parameters based on the binary masks sent by the clients.

More precisely, each client maintains, for each of its parameter tensors, a “probability score tensor”. These scores, after passing through a sigmoid, are Bernoulli probabilities indicating how likely the corresponding parameters are to be pruned or kept. Each client samples a binary mask for every one of its parameter tensors based on the corresponding Bernoulli probabilities. These masks are sent to the server for aggregation.
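The client-side sampling step described above can be sketched as follows. This is a minimal numpy illustration, not the library's client code: `sample_mask` and its signature are hypothetical names for this sketch.

```python
import numpy as np


def sample_mask(scores: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Sample a binary mask from per-parameter Bernoulli keep-probabilities."""
    probs = 1.0 / (1.0 + np.exp(-scores))  # sigmoid of the probability scores
    return (rng.random(scores.shape) < probs).astype(np.float32)


rng = np.random.default_rng(0)
# Strongly positive scores are almost always kept; strongly negative, dropped.
mask = sample_mask(np.array([3.0, -3.0, 0.0]), rng)
```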

Here, we assume that the Bernoulli probabilities of each client themselves follow a Beta distribution with parameters alpha and beta. The binary masks may then be viewed as data with which to update alpha and beta, which corresponds to a posterior update. Due to the conjugacy between the Beta and Bernoulli distributions, the posterior is again a Beta distribution, so the aggregation can be performed in this manner every round.

In this case, the updates performed are:

alpha_new = alpha + M
beta_new = beta + K * 1 - M
theta = (alpha_new - 1) / (alpha_new + beta_new - 2)

where M is the element-wise sum of all binary masks corresponding to a particular parameter tensor, K is the number of clients, and “1” in the second equation refers to an array of ones of the same shape as M.

In the beginning, alpha and beta are initialized to arrays of all ones.

Return type:

dict[str, ndarray[Any, dtype[Any]]]
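The posterior update above can be sketched for a single parameter tensor. This is a standalone numpy illustration of the equations, not the strategy's internal implementation; `beta_posterior_update` is a hypothetical helper name.

```python
import numpy as np


def beta_posterior_update(masks, alpha, beta):
    """Update Beta parameters from K client binary masks and return the mode."""
    m = np.sum(masks, axis=0)  # M: element-wise sum of the client masks
    k = len(masks)             # K: number of contributing clients
    alpha_new = alpha + m
    beta_new = beta + k * np.ones_like(m) - m
    # Mode of the posterior Beta distribution, used as the aggregated mask.
    theta = (alpha_new - 1) / (alpha_new + beta_new - 2)
    return alpha_new, beta_new, theta


# Two clients, priors initialized to ones as described above.
masks = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
alpha_new, beta_new, theta = beta_posterior_update(masks, np.ones(2), np.ones(2))
```

With ones as priors, a parameter kept by both clients yields theta = 1.0, while one kept by a single client yields theta = 0.5, matching uniform averaging in this simple case.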

reset_beta_priors()[source]

Reset the alpha and beta parameters for the Beta distribution to be arrays of all ones.

Return type:

None