Foundational FL Techniques


In this section, FedSGD and FedAvg are detailed, both of which were first proposed in [1]. These methods fall under the category of Horizontal FL. Before detailing how each method works, let's first establish some notation that will be shared in describing both methods. First, assume that there are \(N\) clients in the FL pool, each with a unique local training dataset, \(D_k\). Let

$$ D = \bigcup\limits_{k=1}^{N} D_k, $$

and denote \(\vert D \vert = n\). The end goal is to train a model parameterized by weights \(\mathbf{w}\) using all data in \(D\). Further, let \(\ell(\mathbf{w})\) be a loss function depending on \(\mathbf{w}\).

In standard FL, we aim to train a model by minimizing the loss over the full dataset \(D\) of total size \(n\). This objective is written as

$$ \min_{\mathbf{w} \in \mathbb{R}^d} \ell(\mathbf{w}), \qquad \text{where} \qquad \ell(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^n \ell_i(\mathbf{w}), $$

and \(\ell_i(\mathbf{w})\) is the loss with respect to the \(i^{\text{th}}\) sample in the training dataset. Note that, in the equation above, the dimensionality of the model weights is implicitly denoted as \(d\).
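
Because each sample in \(D\) belongs to exactly one client's dataset, the pooled objective above can be regrouped into per-client terms, which is the form that FedSGD and FedAvg act on. As a sketch of this regrouping, write \(n_k = \vert D_k \vert\) and let \(L_k\) denote client \(k\)'s local average loss (both pieces of notation are introduced here for convenience rather than taken from the section above). The objective then becomes

$$ \ell(\mathbf{w}) = \sum_{k=1}^{N} \frac{n_k}{n} L_k(\mathbf{w}), \qquad \text{where} \qquad L_k(\mathbf{w}) = \frac{1}{n_k} \sum_{i \in D_k} \ell_i(\mathbf{w}). $$

That is, each client contributes the average loss over its own samples, weighted by its share \(n_k / n\) of the total data.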
