
Page 1

IEEE ICASSP 2020 Tutorial on Distributed and Efficient Deep Learning

Wojciech Samek

(Fraunhofer HHI)

1. Introduction (WS)

2. Model Compression & Efficient Deep Learning (WS)

3. (Virtual) Coffee Break

4. Distributed & Federated Learning (FS)

Felix Sattler (Fraunhofer HHI)

Outline of tutorial

Page 2

Part III: Distributed & Federated Learning

Wojciech Samek & Felix Sattler

Page 3

Distributed Learning

[Figure: multiple clients, each holding local data, connected to a central server]

Page 4

Traditional Centralized (“Cloud”) ML

[Figure: all client data is gathered at a central server, which trains the model]

→ Data is gathered centrally

Problems:

– Privacy

– Ownership (→ who owns the data?)

– Security (→ single point of failure)

– Efficiency (→ need to move data around)


Page 5

Distributed (/ “Embedded”) ML

[Figure: each client trains the model locally on its own data; a central server coordinates]

Problems:

– Privacy

– Ownership (→ who owns the data?)

– Security (→ single point of failure)

– Efficiency (→ need to move data around)


→ Data never leaves the local devices

→ Instead, model updates are communicated

Page 6

Distributed (/ “Embedded”) ML: Settings

Detailed Comparison: Sattler, Wiegand, Samek. "Trends and Advancements in Deep Neural Network Communication."

Page 7

Federated Learning

“Federated Learning is a machine learning setting where multiple entities collaborate in solving a learning problem, without directly exchanging data. The federated training process is coordinated by a central server.”

Kairouz, Peter, et al. "Advances and open problems in federated learning."

Page 8

Federated Learning

Page 9

Federated Learning

[Figure: the server distributes the global model to the clients, which hold their local data]

Page 10

Federated Learning

[Figure: each client improves the model via SGD on its local data]

Page 11

Federated Learning

[Figure: the server averages the clients' model updates]

Page 12

Federated Learning

[Figure: the averaged model is sent back to the clients and the cycle repeats]

Page 13

Federated Learning - Settings

Cross Device

– Large number of clients

– Only a fraction of clients is available at any given time

– Few data points per client

– Limited computational resources

Cross Silo

– Small number of clients

– Clients are always available

– Large local data sets

– Strong computational resources

Page 14

Federated Learning - Challenges

Challenges in Federated Learning

– Convergence

– Heterogeneity

– Privacy

– Robustness

– Personalization

– Communication

Page 15

Federated Learning - Communication

[Figure: clients download the global model from the server and upload their local updates]

Page 16

Federated Learning - Communication

Total Communication = [#Communication Rounds] x [#Parameters] x [Avg. Codeword length]

Case Study: VGG16 on ImageNet

– Number of iterations until convergence: 900,000

– Number of parameters: 138,000,000

– Bits per parameter: 32

→ Total Communication = 496.8 terabytes (upload + download)
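A quick back-of-the-envelope check of the case study in Python (the slide's numbers for VGG16 on ImageNet are taken as given):

```python
# Sanity check of the total-communication formula above.
iterations = 900_000          # communication rounds until convergence
parameters = 138_000_000      # VGG16 parameter count
bits_per_parameter = 32

total_bytes = iterations * parameters * bits_per_parameter / 8
print(f"{total_bytes / 1e12:.1f} TB")  # -> 496.8 TB
```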

Page 17

Federated Learning – Compression Methods

Total Communication = [#Communication Rounds] x [#Parameters] x [Avg. Codeword length]

Compression Methods

– Communication Delay

– Lossy Compression: Unbiased

– Lossy Compression: Biased

– Efficient Encoding

Page 18

Communication Delay

Distributed SGD:

For t=1,..,[Communication Rounds]:
    For i=1,..,[Participating Clients]:
        Client does: compute a stochastic gradient g_i = ∇f_i(w_t), send g_i
    Server does: w_{t+1} = w_t − η · (1/n) · Σ_i g_i

Federated Averaging:

For t=1,..,[Communication Rounds]:
    For i=1,..,[Participating Clients]:
        Client does: run τ local SGD steps on the local loss f_i starting from w_t, send the resulting model w_i
    Server does: w_{t+1} = (1/n) · Σ_i w_i
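A minimal NumPy sketch of Federated Averaging, assuming a toy least-squares loss per client (the loss, learning rate, and client data are illustrative assumptions, not from the slides). Distributed SGD corresponds to local_steps=1 with gradient instead of model averaging:

```python
import numpy as np

def local_sgd(w, data, lr=0.01, local_steps=10):
    # Run `local_steps` SGD steps on one client's data, starting from the
    # server model w. Hypothetical loss: f_i(w) = ||X w - y||^2 / (2 m).
    X, y = data
    for _ in range(local_steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def federated_averaging(w, client_data, rounds=50):
    # Every round: clients train locally (communication delay!),
    # then the server averages the returned models.
    for _ in range(rounds):
        local_models = [local_sgd(w.copy(), data) for data in client_data]
        w = np.mean(local_models, axis=0)  # "Server does": average client models
    return w

# Toy setup: 3 clients with random linear-regression data.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
w_final = federated_averaging(np.zeros(5), clients)
```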

Page 19

Communication Delay

Advantages:

● Simple

● Reduces Communication Frequency (advantageous in on-device FL)

● Reduces both Upstream and Downstream communication

● Easy to integrate with Privacy mechanisms

Disadvantages:

Page 20

Communication Delay

Convergence Analysis for Convex Objectives:

IID-Assumption: all clients sample from the same data distribution

The bound depends on:

– Number of clients participating per round

– Total number of communication rounds

– Number of local iterations per round

– Lipschitz parameter of the loss function

– Bound on the variance of the stochastic gradients

Lan, Guanghui. "An optimal method for stochastic composite optimization."

Page 21

Statistical Heterogeneity

Convergence speed drastically decreases with increasing heterogeneity in the data

→ This effect is aggravated if the number of participating clients (the “reporting fraction”) is low

Hsu, Tzu-Ming Harry, Hang Qi, and Matthew Brown. "Measuring the effects of non-identical data distribution for federated visual classification."

[Figure: accuracy as a function of the Dirichlet parameter controlling the degree of data heterogeneity]
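A sketch of the Dirichlet-based non-IID partitioning in the style of Hsu et al., where a small alpha yields highly heterogeneous clients and a large alpha is nearly IID (the function name and toy labels are illustrative assumptions):

```python
import numpy as np

def dirichlet_split(labels, n_clients, alpha, rng):
    # Partition sample indices among clients so that each class is spread
    # across clients according to proportions drawn from Dirichlet(alpha).
    n_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(n_clients)]
    for c in range(n_classes):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        splits = (np.cumsum(proportions) * len(idx)).astype(int)[:-1]
        for client, part in enumerate(np.split(idx, splits)):
            client_indices[client].extend(part)
    return client_indices

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)           # toy 10-class labels
parts = dirichlet_split(labels, n_clients=5, alpha=0.1, rng=rng)
```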

Page 22

Communication Delay

Kairouz, Peter, et al. "Advances and open problems in federated learning."

Page 23

Communication Delay

Advantages:

● Simple

● Reduces Communication Frequency (more practical in on-device FL)

● Reduces Upstream + Downstream communication

● Easy to integrate with Privacy mechanisms

Disadvantages:

● Bad performance on non-iid data

● Low sample efficiency

Page 24

Federated Learning – Compression Methods

Total Communication = [#Communication Rounds] x [#Parameters] x [Avg. Codeword length]

Compression Methods

– Communication Delay

– Lossy Compression: Unbiased

– Lossy Compression: Biased

– Efficient Encoding

Page 25

Update Compression

Distributed SGD:

For t=1,..,[Communication Rounds]:
    For i=1,..,[Participating Clients]:
        Client does: g_i = ∇f_i(w_t), send g_i
    Server does: w_{t+1} = w_t − η · (1/n) · Σ_i g_i

Distributed SGD with Compression:

For t=1,..,[Communication Rounds]:
    For i=1,..,[Participating Clients]:
        Client does: g_i = ∇f_i(w_t), send the compressed update C(g_i)
    Server does: w_{t+1} = w_t − η · (1/n) · Σ_i C(g_i)

Page 26

Update Compression

[Figure: common compression operators, grouped into unbiased and biased methods]

Page 27

Unbiased Compression

Definition: A compression operator C is called unbiased iff E[C(x)] = x.
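A minimal sketch of one such operator, random sparsification with rescaling, which is unbiased by construction (the operator choice and parameters here are illustrative, not prescribed by the slide):

```python
import numpy as np

def random_sparsify(x, p, rng):
    # Keep each coordinate with probability p and rescale the survivors
    # by 1/p, so that E[C(x)] = x holds coordinate-wise.
    mask = rng.random(x.shape) < p
    return np.where(mask, x / p, 0.0)

rng = np.random.default_rng(0)
x = np.arange(1.0, 6.0)
# Averaging many compressed copies recovers x, illustrating unbiasedness
# (at the cost of a much higher per-sample variance):
approx = np.mean([random_sparsify(x, p=0.25, rng=rng) for _ in range(200_000)], axis=0)
```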

Pros:

➔ “Straightforward” convergence analysis (stochastic gradients with increased variance of the gradient estimator)

➔ Variance reduction (the uncorrelated compression noise averages out across clients)

Strongly convex bound:

Page 28

Unbiased Compression

● Pros: “Straightforward” convergence analysis (stochastic gradients, increased variance)

● Cons: Variance blow-up leads to poor empirical performance

Page 29

Biased Compression

Definition: A compression operator C is called biased iff E[C(x)] ≠ x.

Biased compression methods do not necessarily converge!

→ They can be turned into convergent methods via error accumulation.

Karimireddy, et al. "Error feedback fixes SignSGD and other gradient compression schemes."

Stich, Cordonnier, Jaggi. "Sparsified SGD with memory."

Page 30

Error Accumulation

Distributed SGD:

For t=1,..,[Communication Rounds]:
    For i=1,..,[Participating Clients]:
        Client does: g_i = ∇f_i(w_t), send the compressed update C(g_i)
    Server does: w_{t+1} = w_t − η · (1/n) · Σ_i C(g_i)

Distributed SGD with Error Accumulation:

For t=1,..,[Communication Rounds]:
    For i=1,..,[Participating Clients]:
        Client does: a_i = e_i + η · ∇f_i(w_t), send Δ_i = C(a_i), keep e_i = a_i − Δ_i
    Server does: w_{t+1} = w_t − (1/n) · Σ_i Δ_i
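A minimal sketch of client-side error accumulation with top-k sparsification, in the spirit of Stich et al.; the class and function names are illustrative assumptions:

```python
import numpy as np

def top_k(x, k):
    # Biased compression: keep only the k largest-magnitude coordinates.
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

class ErrorFeedbackClient:
    # Compress (residual + new step) and keep the untransmitted part
    # for later rounds instead of discarding it.
    def __init__(self, dim, k):
        self.residual = np.zeros(dim)
        self.k = k

    def compressed_update(self, grad, lr):
        accumulated = self.residual + lr * grad
        update = top_k(accumulated, self.k)    # what gets sent to the server
        self.residual = accumulated - update   # error kept for the next round
        return update
```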

Page 31

Error Accumulation

For a parameter α ∈ (0, 1], an α-contraction operator C is a (possibly randomized) operator that satisfies the contraction property

E ||C(x) − x||² ≤ (1 − α) · ||x||²

Theorem (Stich et al.): For any contraction operator, compressed SGD with Error Accumulation achieves, for large T, an asymptotic convergence rate of O(1/(μT)) on μ-strongly convex objective functions, matching uncompressed SGD

→ Independent of alpha!
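As a concrete instance, top-k sparsification is an α-contraction with α = k/d, which a small NumPy check illustrates (the check itself is an added illustration, not from the slide):

```python
import numpy as np

def top_k(x, k):  # as in the error-feedback sketch above
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out

d, k = 100, 10
x = np.random.default_rng(0).normal(size=d)
ratio = np.sum((top_k(x, k) - x) ** 2) / np.sum(x ** 2)
# The d-k smallest squared coordinates sum to at most (1 - k/d) of the
# total, so top-k satisfies the contraction property deterministically:
assert ratio <= 1 - k / d
```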

Page 32

Federated Learning – Recap Compression

Unbiased methods: TernGrad, QSGD, ATOMO

→ Convergence proofs via the bounded-variance assumption

Biased methods: Gradient Dropping, Deep Gradient Compression, signSGD, PowerSGD

→ Convergence proofs via the k-contraction framework (Stich et al. 2018)

Page 33

Combining Methods: Sparse Binary Compression

Sattler, et al. "Sparse binary compression: Towards distributed deep learning with minimal communication." 2019 International Joint Conference on Neural Networks (IJCNN).

Page 34

Sparse Binary Compression

Page 35

Sparse Binary Compression for non-iid Data

Page 36

Federated Learning – Combining Methods

Sattler, Wiedemann, Müller, Samek. "Robust and communication-efficient federated learning from non-iid data." IEEE TNNLS (2019).

Page 37

Efficient Encoding – DeepCABAC

● CABAC best encoder for quantized parameter tensors

● Plug & Play

● Can be used as a final lossless compression stage for all compression methods that we have presented

Wiedemann et al. "DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks."

Page 38

Federated Learning - Challenges

Challenges in Federated Learning

– Convergence

– Heterogeneity

– Privacy

– Robustness

– Personalization

– Communication (✔)

Page 39

Federated Learning – Privacy

Hitaj, Briland, Giuseppe Ateniese, and Fernando Perez-Cruz. "Deep models under the GAN: information leakage from collaborative deep learning."

Page 40

Federated Learning – Privacy

[Figure: clients with local data communicating model updates with a central server]

Page 41

Federated Learning – Privacy

Privacy Protection Mechanisms:

- Secure Multi-Party Computation

- Homomorphic Encryption

- Trusted Execution Environments

- Differential Privacy


Dwork, Cynthia, and Aaron Roth. "The algorithmic foundations of differential privacy."

Page 42

Differential Privacy

A mechanism A is called differentially private with parameter ε iff

P[A(D) = t] ≤ e^ε · P[A(D′) = t]

for all outcomes t and any two data sets D and D′ which differ in only one element.

Page 43

Differential Privacy – Mechanisms

Global Sensitivity: Δf = max_{D,D′} ||f(D) − f(D′)||₁ over data sets D, D′ differing in one element

The Laplace mechanism A(D) = f(D) + Lap(Δf/ε) is ε-differentially private.
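A minimal sketch of the Laplace mechanism for a counting query, assuming sensitivity 1 (the function name and numbers are illustrative):

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng):
    # Release `value` with epsilon-DP by adding Laplace noise
    # of scale sensitivity / epsilon.
    return value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

rng = np.random.default_rng(0)
# A counting query has global sensitivity 1: adding or removing one
# record changes the count by at most 1.
true_count = 42
private_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5, rng=rng)
```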

Page 44

Differential Privacy – Post Processing

Data released via an ε-differentially private mechanism is ε-differentially private under arbitrary post-processing!

[Figure: Data → differentially private algorithm → privacy barrier → non-private post-processing]

Page 45

Differential Privacy – Basic Composition

After applying R algorithms with privacy parameters ε₁, …, ε_R to the data, the total privacy loss is ε = ε₁ + … + ε_R

→ Privacy loss is additive!

This is a worst-case analysis – better bounds can be found using more elaborate accounting mechanisms (e.g. the moments accountant)
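A tiny worked example of basic composition (the numbers are illustrative):

```python
# Ten releases over the same data set, each 0.1-differentially private,
# cost epsilon = 1.0 in total under basic (worst-case) composition.
epsilons = [0.1] * 10
total_epsilon = sum(epsilons)  # 1.0; tighter accountants give smaller bounds
```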

Page 46

Privacy and Communication

We need methods which are both communication-efficient and privacy-preserving!

Differential Privacy:

➔ adds artificial noise to the parameter updates to obfuscate them

Compression Methods:

➔ add quantization noise to the parameter updates to reduce the bitwidth

→ combine the two approaches!

Li, Tian, et al. "Privacy for Free: Communication-Efficient Learning with Differential Privacy Using Sketches."

Page 47

Federated Learning - Challenges

Challenges in Federated Learning

– Convergence

– Heterogeneity

– Privacy ✔

– Robustness

– Personalization

– Communication (✔)

Page 48

Federated Learning – Meta- and Multi-Task Learning

Federated Learning Environments are characterized by a high degree of statistical heterogeneity of the client data

→ In many situations, learning one single central model is suboptimal or even undesirable

Page 49

Federated Learning – Meta- and Multi-Task Learning

Client data:

→ IID data: one model can be learned

→ non-IID data: no single model can fit the data of all clients

Page 50

Federated Learning – Meta- and Multi-Task Learning

I like ...

I like ice cream.

I like cats.

I like Beyoncé.

A linear classifier can correctly separate the data of every single client, but not simultaneously for all clients
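A tiny NumPy illustration of this point, assuming two hypothetical clients with mirrored labels on the same inputs:

```python
import numpy as np

# The same input x should be labeled +1 on client A and -1 on client B.
x = np.array([1.0, 2.0])
y_a = np.array([1.0, 1.0])    # client A: separable by any w > 0
y_b = np.array([-1.0, -1.0])  # client B: separable by any w < 0

for w in np.linspace(-2.0, 2.0, 41):
    fits_a = np.all(np.sign(w * x) == y_a)
    fits_b = np.all(np.sign(w * x) == y_b)
    assert not (fits_a and fits_b)  # no single linear model fits both clients
```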

Page 51

Clustered Federated Learning

[Figure: the client population is grouped into clusters of clients with similar data distributions, coordinated by the server]

How to identify the clusters?

→ Clustered Federated Learning groups the client population into clusters with jointly trainable data distributions and trains a separate model for every cluster

Page 52

Clustered Federated Learning

[Figure: clients with local data connected to the server; the clusters are identified from the clients' model updates]

How to identify the clusters?

→ via the model updates!

Page 53

Clustered Federated Learning

[Figure: pairwise cosine similarity (between −1 and 1) of the client updates over communication rounds]

At every stationary solution of the Federated Learning objective, the angle between the parameter updates of the different clients is highly indicative of their distribution similarity!

Page 54

Clustered Federated Learning - Algorithm

1.) Run Federated Learning until convergence to a stationary solution

2.) Compute the pairwise cosine similarity between the latest parameter updates from all clients

3.) If there exists a client whose local empirical risk is not sufficiently minimized by the federated learning solution ...

4.) … then bi-partition the client population into two groups of minimal pairwise similarity

5.) Repeat everything for the two groups, starting from 1.)
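A sketch of steps 2.) and 4.) in plain NumPy; the greedy split below is a simple stand-in for the exact minimal-cross-similarity bipartition described in the paper:

```python
import numpy as np

def cosine_similarities(updates):
    # Pairwise cosine similarities between the clients' parameter updates.
    U = np.stack([u / np.linalg.norm(u) for u in updates])
    return U @ U.T

def bipartition(sim):
    # Greedy split: seed the two groups with the least similar pair of
    # clients, then attach every other client to its closer seed.
    i, j = np.unravel_index(np.argmin(sim), sim.shape)
    group_1, group_2 = [i], [j]
    for c in range(sim.shape[0]):
        if c in (i, j):
            continue
        (group_1 if sim[c, i] >= sim[c, j] else group_2).append(c)
    return group_1, group_2

rng = np.random.default_rng(0)
# Hypothetical updates: two latent clusters pointing in opposite directions.
updates = [rng.normal(size=10) + (5 if n % 2 else -5) for n in range(8)]
print(bipartition(cosine_similarities(updates)))
```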

Page 55

Clustered Federated Learning – Clustering Guarantees

Theorem (informal): the proposed mechanism will correctly separate the clients whenever the cross-cluster similarity of the client updates is sufficiently small compared to the intra-cluster similarity, with the required gap depending on the number of clusters (see the paper for the precise bound).

Page 56

Federated Learning – Clustered Federated Learning

Page 57

Clustered Federated Learning

Measure cluster quality via:

Then:

Page 58

Clustered Federated Learning

● Few data points are sufficient to obtain a correct clustering

● Few communication rounds are sufficient to obtain a correct clustering

● Works also with weight updates

Page 59

Clustered Federated Learning

[Figure: recursive bi-partitioning of the client population: Federated Learning, 1st split, 2nd split, 3rd split]

Page 60

Clustered Federated Learning

1) FL has converged to a stationary solution

2) After the 1st split, accuracy drastically increases for the group of clients that was separated out

3) After the third round of splitting, g(a) has dropped below zero for all remaining clusters

Sattler, Müller, Samek. "Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints."

Page 61

Federated Learning – Clustered Federated Learning

Sattler, Müller, Samek. “On the Byzantine Robustness of Clustered Federated Learning” (ICASSP 2020)

Page 62

Federated Learning - Challenges

Challenges in Federated Learning

– Convergence

– Heterogeneity ✔

– Privacy ✔

– Robustness ✔

– Personalization ✔

– Communication (✔)

Page 63

Neural Network Compression

References

S Wiedemann, H Kirchhoffer, S Matlage, P Haase, A Marban, T Marinc, D Neumann, T Nguyen, A Osman, H Schwarz, D Marpe, T Wiegand, W Samek. DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks. IEEE Journal of Selected Topics in Signal Processing, 14(3):1-15, 2020. http://dx.doi.org/10.1109/JSTSP.2020.2969554

S Yeom, P Seegerer, S Lapuschkin, S Wiedemann, KR Müller, W Samek. Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning. arXiv:1912.08881, 2019. https://arxiv.org/abs/1912.08881

S Wiedemann, H Kirchhoffer, S Matlage, P Haase, A Marban, T Marinc, D Neumann, A Osman, D Marpe, H Schwarz, T Wiegand, W Samek. DeepCABAC: Context-adaptive binary arithmetic coding for deep neural network compression. Joint ICML'19 Workshop on On-Device Machine Learning & Compact Deep Neural Network Representations (ODML-CDNNR), 1-4, 2019. *** Best paper award *** https://arxiv.org/abs/1905.08318

S Wiedemann, A Marban, KR Müller, W Samek. Entropy-Constrained Training of Deep Neural Networks. Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), 1-8, 2019. http://dx.doi.org/10.1109/IJCNN.2019.8852119

Page 64

Efficient Deep Learning

References

S Wiedemann, KR Müller, W Samek. Compact and Computationally Efficient Representation of Deep Neural Networks. IEEE Transactions on Neural Networks and Learning Systems, 31(3):772-785, 2020. http://dx.doi.org/10.1109/TNNLS.2019.2910073

S Wiedemann, T Mehari, K Kepp, W Samek. Dithered backprop: A sparse and quantized backpropagation algorithm for more efficient deep neural network training. Proceedings of the CVPR'20 Joint Workshop on Efficient Deep Learning in Computer Vision, 2020. http://arxiv.org/abs/2004.04729

A Marban, D Becking, S Wiedemann, W Samek. Learning Sparse & Ternary Neural Networks with Entropy-Constrained Trained Ternarization (EC2T). Proceedings of the CVPR'20 Joint Workshop on Efficient Deep Learning in Computer Vision, 2020. http://arxiv.org/abs/2004.01077

Page 65

Federated Learning

References

F Sattler, T Wiegand, W Samek. Trends and Advancements in Deep Neural Network Communication. ITU Journal: ICT Discoveries, 2020. https://arxiv.org/abs/2003.03320

F Sattler, KR Müller, W Samek. Clustered Federated Learning: Model-Agnostic Distributed Multi-Task Optimization under Privacy Constraints. arXiv:1910.01991, 2019. https://arxiv.org/abs/1910.01991

F Sattler, S Wiedemann, KR Müller, W Samek. Robust and Communication-Efficient Federated Learning from Non-IID Data. IEEE Transactions on Neural Networks and Learning Systems, 2019. http://dx.doi.org/10.1109/TNNLS.2019.2944481

F Sattler, KR Müller, W Samek. Clustered Federated Learning. Proceedings of the NeurIPS'19 Workshop on Federated Learning for Data Privacy and Confidentiality, 1-5, 2019.

Page 66

Federated Learning

References

F Sattler, KR Müller, T Wiegand, W Samek. On the Byzantine Robustness of Clustered Federated Learning. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 8861-8865, 2020. http://dx.doi.org/10.1109/ICASSP40776.2020.9054676

F Sattler, S Wiedemann, KR Müller, W Samek. Sparse Binary Compression: Towards Distributed Deep Learning with minimal Communication. Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), 1-8, 2019. http://dx.doi.org/10.1109/IJCNN.2019.8852172

Page 67

Slides and Papers available at