Federated Learning

Wed, Sep 4, 2024
One-minute read

Basic concepts of Federated Learning (FL).

Introduction

FL is divided into different types according to how the data is distributed and how the participants collaborate:

Vertical FL
Horizontal FL
Transfer FL
Hybrid FL?
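As a rough illustration of the first two types: horizontal FL is usually described as parties holding different samples with the same features, and vertical FL as parties holding the same samples with different features. The toy table, column names, and helper functions below are made up for this sketch and are not from any FL library.

```python
# Toy table: each row is one sample, each column one feature.
# All names and values here are hypothetical.
rows = [
    {"id": 1, "age": 34, "income": 52, "label": 0},
    {"id": 2, "age": 45, "income": 61, "label": 1},
    {"id": 3, "age": 29, "income": 48, "label": 0},
    {"id": 4, "age": 51, "income": 75, "label": 1},
]


def horizontal_split(rows, n_parties):
    """Each party gets different samples but the same feature columns."""
    return [rows[i::n_parties] for i in range(n_parties)]


def vertical_split(rows, feature_groups):
    """Each party gets the same samples but only its own feature columns.

    The shared "id" column is kept so that records can be aligned later.
    """
    return [
        [{key: row[key] for key in ["id", *group]} for row in rows]
        for group in feature_groups
    ]


h_parts = horizontal_split(rows, 2)
v_parts = vertical_split(rows, [["age"], ["income", "label"]])

print("horizontal:", [[r["id"] for r in part] for part in h_parts])
print("vertical:  ", [sorted(part[0].keys()) for part in v_parts])
```

In the horizontal case every party could train the same model architecture locally; in the vertical case the parties would first need to align records on the shared `id`.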

Knowhow

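FL can protect the privacy of the dataset: training does not require transmitting the dataset, only the weights.
FL can collect the weights from each local client and merge them, so that the model generalizes better.
Independent and identically distributed (IID)
When data is not independent and identically distributed (non-IID), statistical analysis, the performance of machine learning models, and inference results can all be affected:
1. Model assumptions are violated
2. Dependence in time-series data
3. Data bias in classification and regression
4. Risk of overfitting
5. Unreliable inference and estimation
Non-IID data violates the basic assumptions of many statistical and machine learning models, which degrades model performance or makes inference results unreliable.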
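To make the IID versus non-IID distinction concrete, here is a minimal sketch that splits a toy label set across clients in two ways. The label-skew scheme (sort by label, then cut into shards) is only one common way to simulate non-IID data, and the function names are hypothetical.

```python
import random
from collections import Counter


def make_labels(n_per_class=100, n_classes=4):
    """Toy dataset: only the labels matter for this illustration."""
    return [c for c in range(n_classes) for _ in range(n_per_class)]


def iid_partition(labels, n_clients, seed=0):
    """Shuffle, then deal samples round-robin, so clients see a similar class mix."""
    rng = random.Random(seed)
    shuffled = labels[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_clients] for i in range(n_clients)]


def label_skew_partition(labels, n_clients):
    """Sort by label, then cut into contiguous shards, so each client sees few classes."""
    ordered = sorted(labels)
    size = len(ordered) // n_clients
    return [ordered[i * size:(i + 1) * size] for i in range(n_clients)]


labels = make_labels()
iid_parts = iid_partition(labels, 4)
skew_parts = label_skew_partition(labels, 4)

for name, parts in [("IID", iid_parts), ("non-IID", skew_parts)]:
    print(name, [dict(Counter(part)) for part in parts])
```

With four classes and four clients, each label-skewed client ends up holding a single class, which is the kind of local distribution that can pull client models in different directions.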

Learning Type

Training Time

There are three main processes in FL.

Training
Communication
Averaging

$$ \begin{aligned} \text{time}_{\text{total}} = \text{epochs} \times ( \text{time}_{\text{training}} + \text{time}_{\text{comm}} + \text{time}_{\text{avg}} ) \end{aligned} $$
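A small worked example of the formula, with made-up numbers (50 epochs, 12 s of training, 3 s of communication, and 0.5 s of averaging per epoch). The function name is just for illustration.

```python
def total_time(epochs, time_training, time_comm, time_avg):
    """Total FL time: every epoch pays for training, communication, and averaging."""
    return epochs * (time_training + time_comm + time_avg)


# Hypothetical per-epoch costs in seconds.
result = total_time(epochs=50, time_training=12.0, time_comm=3.0, time_avg=0.5)
print(result)  # 775.0
```

Because the three costs are summed inside the parentheses, cutting any one of them by a fixed amount saves that amount once per epoch.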

Life cycle of FL

(1) Training in a distributed fashion, where raw data is kept on the devices, and each selected client trains a model locally and sends its parameters to the server
(2) Aggregation of the received models performed on the server
(3) Distribution of the new model to the clients.
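Step (2) can be sketched as a weighted average of the client parameters. Weighting by each client's local sample count is an assumption here (it is the choice made by the FedAvg algorithm); the text above only says the server aggregates. The function name and the numbers are hypothetical.

```python
def aggregate(client_params, client_sizes):
    """Average parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    n_params = len(client_params[0])
    return [
        sum(params[j] * size for params, size in zip(client_params, client_sizes)) / total
        for j in range(n_params)
    ]


# Two clients, each sending a 2-parameter model.
client_params = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]  # assumed number of local samples per client

global_params = aggregate(client_params, client_sizes)
print(global_params)  # [2.5, 3.5]
```

When all clients hold the same number of samples, this reduces to a plain mean of the parameters.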

Learning Process

Train on the client
Send the trained weights to the server
Train on the server
Average on the server
Send the averaged weights and batch size to the clients
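The loop above can be sketched on a toy problem: each client fits a one-parameter model y ≈ w * x with a few gradient-descent steps, and the server takes a plain mean of the client weights. This is a simplified sketch under stated assumptions, not a full implementation: the server-side training step and the batch size are not modeled, and all names, data, and hyperparameters are made up.

```python
def local_train(w, data, lr=0.01, steps=5):
    """Client side: a few gradient-descent steps on mean squared error."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def run_round(global_w, clients):
    """One FL round: local training on every client, then averaging on the server."""
    # Each client starts from the current global weight and "sends" its result.
    local_ws = [local_train(global_w, data) for data in clients]
    # The server averages the received weights (unweighted mean in this sketch).
    return sum(local_ws) / len(local_ws)


# Two clients whose private data both follow y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0)],
]

w = 0.0
for _ in range(20):
    # The averaged weight is sent back and used as the next starting point.
    w = run_round(w, clients)

print(round(w, 3))
```

Only the scalar `w` crosses the client/server boundary; the `(x, y)` pairs never leave the client lists, which mirrors the privacy argument for FL.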