Federated learning is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging those samples. This approach stands in contrast both to traditional centralized machine learning, where all the local datasets are uploaded to one server, and to more classical decentralized approaches, which often assume that local data samples are identically distributed. Federated learning enables multiple actors to build a common, robust machine learning model without sharing data, thus addressing critical issues such as data privacy, data security, data access rights, and access to heterogeneous data.
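The canonical aggregation scheme for this setup is federated averaging (FedAvg): each client refines the current global model on its private data, and a central server averages the resulting parameters, weighted by each client's sample count. The following is a minimal sketch of that idea in plain NumPy; the linear-regression model, the three synthetic clients, and all hyperparameters are illustrative assumptions, not part of any particular framework.

```python
# Minimal FedAvg sketch (assumptions: linear regression, synthetic clients).
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Client step: refine the global weights on local data only."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # mean-squared-error gradient
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server step: average updates, weighted by each client's sample count."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])

# Three clients with private, differently sized datasets that are never pooled.
clients = []
for n in (40, 60, 100):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(20):  # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(y) for _, y in clients])

print(global_w)  # approaches true_w; only model weights ever left the clients
```

Note that only parameter vectors cross the network in each round; the raw `(X, y)` pairs stay on their respective clients, which is the core privacy property the technique relies on.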
Federated learning was introduced by Google researchers in 2016 (McMahan et al.) in the paper 'Communication-Efficient Learning of Deep Networks from Decentralized Data'. It was initially designed for mobile devices, such as predicting the next word on a keyboard.
It is now widely used in applications where data privacy is paramount, such as healthcare (training models on patient data from multiple hospitals without sharing records), finance (detecting fraud across banks), and smart devices (improving predictive text and voice recognition on phones).