
EA MAP 578: Emerging topics in ML

Collaborative and Reliable Learning

Aymeric DIEULEVEUT - El Mahdi EL MHAMDI


What’s this lecture about

Objectives of the lecture


1 Introduction to Collaborative Learning
  An exciting research domain
  Focusing on recent advances (2015-2023 and onward)
  Connecting multiple fields (mathematics, computer science, economics, social sciences)
2 Introduction to research
  Presenting both the intuition and the mathematical results
  Reading papers!


Organization of the lecture

9 lectures:
1 Lectures on some crucial topics (Lectures 1-5)
2 Reading articles in small groups (Lectures 6-7)
3 Presenting those articles (Lectures 8-9)

Each Tuesday afternoon, 2 sequences.

Starting 13h30 - Ending 17h30-18h




Detailed Organization - Tentative

#   Date   Program, first 2h                        Program, next 2h
1   19/09  General introduction (AD & EE)           Topic 1: Stochastic Optimisation (EE)
2   26/09  Topic 2: Distributed (centralized) (EE)  Topic 3: Fault tolerance (EE)
-   03/10  No lecture
-   10/10  No lecture
3   17/10  Topic 4: AI Alignment (EE)               Articles overview and pre-choice (EE)
4   24/10  Topic 5: Decentralized (AD)              Topic 6: Heterogeneity (AD)
5   07/11  Topic 7: Compression (AD)                Topic 8: Differential Privacy (AD)
6   14/11  Article choice and preparation 2 (EE)    Article preparation 3 (EE)
7   21/11  Article preparation 4 (AD & EE)          Article preparation 5 (EE)
8   28/11  Article preparation 6 (EE)               Article preparation 7 (EE)
9   05/12  Article presentation (1/2) (AD & EE)     Article presentation (2/2) (AD & EE)
10  12/12  Evaluation week - no session
11  19/12  Evaluation week

Spreadsheet for paper selection (the list is indicative and will evolve over the next 1-3 weeks):
https://docs.google.com/spreadsheets/d/1WkmYHmFUMnS0FjM8UhX1z2S2xKGCj0vUJnnSRCHIR2E


Who are we?

Aymeric DIEULEVEUT
Assistant professor at Polytechnique, CMAP
Interests:
Optimization and Statistics, links between the two aspects
Large Scale Learning
Federated, Distributed, Privacy preserving learning

Contact: [email protected]

El Mahdi EL MHAMDI
Assistant professor at Polytechnique, CMAP
Interests:
Distributed systems, distributed algorithms
Robustness, fault tolerance
Computable ethics (mathematics, analytical philosophy, social sciences...)

Contact: [email protected]


Outline - Today

1 First sequence: general introduction to Collaborative Learning.
  Description of the main challenges.
  A 90-minute journey through 50 years of learning.
2 Second sequence: Stochastic Algorithms.
  The main workhorse of learning: a few convergence rates!


Programming-based computer science

Can you tell whether a number is prime?

Can you sort a list?



Deduction versus Induction

Laws from data.

Better laws, e.g., $m \, \vec{a} = \sum \vec{F}_{\mathrm{ext}}$.

Classic CS = Automating Deduction
Machine Learning = Automating Induction
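To make the contrast concrete, here is a minimal sketch (made-up data, not from the slides): a primality test is deduction, an explicitly programmed rule, while recovering the coefficient $m$ in $F = m\,a$ from noisy measurements is induction, a law inferred from data.

```python
import numpy as np

# Deduction: an explicitly programmed rule (classic CS).
def is_prime(n: int) -> bool:
    """Trial division: correctness follows from the definition of primality."""
    if n < 2:
        return False
    return all(n % k for k in range(2, int(n**0.5) + 1))

# Induction: infer the law F = m * a from noisy observations (machine learning).
rng = np.random.default_rng(0)
true_m = 2.5
a = rng.uniform(0.0, 10.0, size=100)            # accelerations
F = true_m * a + rng.normal(0.0, 0.5, size=100) # noisy force measurements
m_hat = (a @ F) / (a @ a)                       # least-squares estimate of m
print(is_prime(97), m_hat)                      # True, approximately 2.5
```
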
Spam detection

Training data: emails

Input: email (text, header - sender, recipients, smtp)
Output: "Spam" / "Not Spam"


Face detection

Training data: annotated images

Input: image patches (e.g., 32 × 32 pixels)
Output: "Face" / "Not Face"...


Handwritten digit recognition

Data: databases of digits (each image is a vector of 28 × 28 = 784 grayscale pixels)
Input: image
Output: recognized digit


Supervised learning

Supervised learning

Features: $X = (X^{(1)}, X^{(2)}, \ldots, X^{(d)}) \in \mathcal{X}$
Labels: $Y \in \mathcal{Y}$.
$(X, Y) \sim P$ where $P$ is unknown.
Training set: $D_n = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}$, i.i.d. $\sim P$
- $X \in \mathbb{R}^d$ and $Y \in \{-1, 1\}$ or $\{0, 1\}$ (classification)
- $X \in \mathbb{R}^d$ and $Y \in \mathbb{R}$ (regression).

A classifier or predictor is a measurable function $g : \mathcal{X} \to \mathcal{Y}$.

Objective

Build a good predictor / classifier $\widehat{g}$ using the training data.


Loss and risk

Loss function

Loss function: $\ell(y, g(x))$ quantifies the quality of the prediction $g(x)$ of $y$.

Examples:
0-1 loss (classification): $\ell(y, g(x)) = \mathbf{1}_{\{y \neq g(x)\}}$, $y \in \mathcal{Y} = \{0, 1\}$
Quadratic loss (regression): $\ell(y, g(x)) = |y - g(x)|^2$

Risk of a decision rule

Risk: $R(g) = \mathbb{E}[\ell(Y, g(X))]$

Examples:
0-1 loss (classification): $R(g) = \mathbb{P}(Y \neq g(X))$
Quadratic loss (regression): $R(g) = \mathbb{E}[|Y - g(X)|^2]$
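A quick numerical illustration (synthetic data, not from the slides): both risks are expectations under $P$, so they can be approximated by empirical averages over samples drawn from $P$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
X = rng.normal(size=n)
Y = (X + 0.5 * rng.normal(size=n) > 0).astype(int)  # labels correlated with X

g = lambda x: (x > 0).astype(int)  # a simple decision rule

# 0-1 loss: empirical estimate of R(g) = P(Y != g(X))
risk_01 = np.mean(Y != g(X))
# Quadratic loss: empirical estimate of R(g) = E[|Y - g(X)|^2]
risk_sq = np.mean((Y - g(X)) ** 2)
# The two coincide here, since |y - g(x)|^2 = 1{y != g(x)} for 0/1 values.
print(risk_01, risk_sq)
```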


Supervised learning

Training set: $D_n = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}$, i.i.d. $\sim P$

Decision rule (classifier, predictor): a measurable function $g : \mathcal{X} \to \mathcal{Y}$
Loss: $\ell(Y, g(X))$
Risk: $R(g) = \mathbb{E}[\ell(Y, g(X))]$


Bayes predictors, LS regression

There exists an optimal predictor, denoted $g^{\mathrm{Bayes}}$ or $g^*$.

For instance, for least-squares regression,

$$g^* = \arg\min_{g \in \mathbb{R}^{\mathcal{X}}} \mathbb{E}[(g(X) - Y)^2].$$

Then
$$g^*(X) = \mathbb{E}[Y \mid X],$$
the regression function.
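Why the minimizer is the conditional expectation (a standard one-line argument, added here for completeness): for any candidate $g$,

$$\mathbb{E}\big[(g(X) - Y)^2\big] = \mathbb{E}\big[(g(X) - \mathbb{E}[Y \mid X])^2\big] + \mathbb{E}\big[(\mathbb{E}[Y \mid X] - Y)^2\big],$$

since the cross term vanishes by the tower property. The second term does not depend on $g$, so the risk is minimized by taking $g(X) = \mathbb{E}[Y \mid X]$.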


Overfitting / Underfitting I

1. Decomposition of the excess risk and choice of $\mathcal{C}$. Let $g^*_{\mathcal{C}}$ be such that $R(g^*_{\mathcal{C}}) = \inf_{g \in \mathcal{C}} R(g)$. Then

$$R(\widehat{g}_{n,\mathcal{C}}) - R(g^{\mathrm{Bayes}}) = \underbrace{R(\widehat{g}_{n,\mathcal{C}}) - R(g^*_{\mathcal{C}})}_{\text{estimation error}} + \underbrace{R(g^*_{\mathcal{C}}) - R(g^{\mathrm{Bayes}})}_{\text{approximation error}}$$

As the size of $\mathcal{C}$ increases, the approximation error decreases while the estimation error increases.

The approximation error can be large if $\mathcal{C}$ is poorly chosen.
The estimation error can be large if $\mathcal{C}$ is complex (this needs to be made precise).


Overfitting / Underfitting II

[Figure: behavior of the approximation error and the estimation error as a function of the complexity of the model class $\mathcal{C}$.]


Statistical / optimization viewpoints

Problems
How to choose $\mathcal{C}$?
How to select a decision rule within $\mathcal{C}$?

"Generative" approach
Solution: estimate the regression function $\eta(X) = \mathbb{P}(Y = 1 \mid X)$ and plug this estimator into the Bayes rule: (generalized) linear models, kernel methods, k-nearest neighbors, naive Bayes,...

"Optimization" approach
Solution: minimize the empirical risk (or an upper bound on the empirical risk): support vector machines, neural networks,...


Empirical risk minimization

We have a training set

$$D_n = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}.$$

We can therefore estimate the risk by the empirical risk

$$\widehat{R}_n(g) = \frac{1}{n} \sum_{i=1}^{n} \ell(Y_i, g(X_i)).$$

Since $\mathbb{E}[\ell(Y_i, g(X_i))] = R(g)$, $\widehat{R}_n(g)$ is an unbiased estimator of $R(g)$.
Moreover, for any fixed $g$, the law of large numbers gives $\widehat{R}_n(g) \xrightarrow{P\text{-prob.}} R(g)$, so for $n \to \infty$, $\widehat{R}_n(g) \approx R(g)$.
Idea (?): replace the risk by the empirical risk!
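A minimal ERM sketch (synthetic data, hypothetical class of threshold classifiers): compute $\widehat{R}_n$ for each candidate and keep the minimizer.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000
X = rng.normal(size=n)
Y = (X > 0.3).astype(int)  # ground truth: threshold at 0.3 (unknown to the learner)

# Candidate class C: threshold classifiers g_t(x) = 1{x > t}
thresholds = np.linspace(-2, 2, 81)

def empirical_risk(t):
    """Empirical 0-1 risk of g_t on the training set."""
    return np.mean(Y != (X > t))

t_hat = min(thresholds, key=empirical_risk)  # empirical risk minimizer over C
print(t_hat, empirical_risk(t_hat))          # close to 0.3, small risk
```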


Parametrization of the function class

1 Linear: least squares, logistic regression, SVM

2 Non-linear: neural networks


Large models and large numbers of observations

1 Model size

Models are growing in size at great speed.
Many neural networks have several million parameters.
The dimension $d$ of the space over which the empirical risk must be optimized is therefore very large.




Optimisation problem

Optimisation

$$\arg\min_{g \in \mathcal{C}} \widehat{R}_n(g) \qquad (1)$$

Parametrizing the class $\mathcal{C}$ by $\theta \in \mathbb{R}^d$ (so that $F(\theta) = \widehat{R}_n(g_\theta)$), we thus focus on the following optimization problem:

Optimisation

$$\arg\min_{\theta \in \mathcal{C} \subset \mathbb{R}^d} F(\theta) \qquad (2)$$
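To preview how such problems are attacked with local information, here is a minimal gradient-descent sketch on a toy least-squares instance of (2) (synthetic data and an illustrative step size; a sketch, not the lecture's algorithm):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 200, 5
X = rng.normal(size=(n, d))
theta_star = rng.normal(size=d)
y = X @ theta_star + 0.1 * rng.normal(size=n)

def grad_F(theta):
    """Gradient of F(theta) = (1/2n) * ||X theta - y||^2."""
    return X.T @ (X @ theta - y) / n

theta = np.zeros(d)
eta = 0.1  # step size; must stay below 2/L, L = largest eigenvalue of X^T X / n
for _ in range(500):
    theta -= eta * grad_F(theta)
print(np.linalg.norm(theta - theta_star))  # small: GD converges on this convex problem
```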




Optimisation

Some questions in Optimization

How to tackle such an optimization problem?
1 Grid search
2 Use local information: an algorithmic question!
What makes it difficult?
How fast do algorithms converge?

This afternoon: Stochastic Optimization.
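Since $\widehat{R}_n$ is an average of $n$ terms, a full gradient costs $O(n)$ per step; stochastic gradient descent instead uses one randomly drawn sample per step. A minimal sketch, again on synthetic least squares with an illustrative decaying step size:

```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 1_000, 5
X = rng.normal(size=(n, d))
theta_star = rng.normal(size=d)
y = X @ theta_star + 0.1 * rng.normal(size=n)

theta = np.zeros(d)
for t in range(1, 20_001):
    i = rng.integers(n)                       # sample one data point
    g = (X[i] @ theta - y[i]) * X[i]          # unbiased estimate of the full gradient
    theta -= (0.1 / (1 + 0.001 * t)) * g      # decaying step size
print(np.linalg.norm(theta - theta_star))     # small: SGD approaches theta_star up to noise
```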


Challenge 1: distribution

Distribution
1 Several workers/agents/nodes share the data or the model
2 Data distribution: each holds a share of the data
3 Model distribution: each holds a share of the model

[Figure: different settings for communication]


Challenge 1: distribution

Data-distributed framework: optimization problem

$$\arg\min_{\theta \in \mathcal{C} \subset \mathbb{R}^d} \; \sum_{i=1}^{N} F_i(\theta) \qquad (3)$$

Each $F_i$ is the (empirical) loss on the data held by worker $i \in [N]$.
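A minimal simulation of the centralized pattern for problem (3) (hypothetical quadratic local losses, all workers simulated in one process): each worker computes its local gradient and the server averages them.

```python
import numpy as np

rng = np.random.default_rng(5)
N, d = 10, 3
C = rng.normal(size=(N, d))  # hypothetical: worker i holds F_i(theta) = 0.5 * ||theta - c_i||^2

def local_grad(i, theta):
    """Gradient of worker i's local loss."""
    return theta - C[i]

theta, eta = np.zeros(d), 0.5
for _ in range(100):
    grads = [local_grad(i, theta) for i in range(N)]  # computed in parallel in practice
    theta -= eta * np.mean(grads, axis=0)             # server step on (1/N) * sum_i F_i
print(np.allclose(theta, C.mean(axis=0)))             # True: the minimizer is the mean of c_i
```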






Challenge 1: distribution

Settings and key questions:

1 Synchronous vs asynchronous framework
2 Centralized: how often do we communicate with the server?
3 Decentralized: what is the network topology? How long does it take for information to propagate?

Questions:
1 What is the speedup compared to the single-worker setting?
2 How does the network topology influence convergence in the decentralized setting?

Next week: Distributed Optimization.
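A minimal decentralized sketch (assumed ring topology and simple gossip averaging of scalar values, for illustration only): each round, every node mixes its value with its neighbors', and the convergence speed is governed by the spectral gap of the gossip matrix, i.e., by the topology.

```python
import numpy as np

N = 20
# Gossip matrix for a ring: each node averages itself with its two neighbors.
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = W[i, (i - 1) % N] = W[i, (i + 1) % N] = 1 / 3

x = np.random.default_rng(6).normal(size=N)  # initial local values
target = x.mean()                            # consensus value (preserved: W is doubly stochastic)
for _ in range(300):
    x = W @ x                                # one communication round
print(np.max(np.abs(x - target)))            # small; shrinks at a rate set by W's spectral gap
```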


Challenge 2: Heterogeneity

[Figure: N workers with heterogeneous local data distributions, D_i ≠ D_j]

Heterogeneity

1 Multiple sources of heterogeneity: label or feature shift?

2 From the optimization point of view, assumptions are placed on "simple quantities"


Challenge 2: Heterogeneity and adaptation?

Two different problems?

1 Averaging consensus:

$$\arg\min_{\theta \in \mathcal{C} \subset \mathbb{R}^d} \; \sum_{i=1}^{N} F_i(\theta) \qquad (4)$$

2 Adaptation:

$$\arg\min_{(\theta_i)_{i \in [N]} \in \mathcal{C}^N \subset (\mathbb{R}^d)^N} \; \sum_{i=1}^{N} F_i(\theta_i) \qquad (5)$$
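A toy case that makes the difference concrete (hypothetical quadratic losses $F_i(\theta) = \frac{1}{2}\|\theta - c_i\|^2$): problem (4) returns one shared model, the mean of the $c_i$, while problem (5) lets each agent keep its own minimizer $\theta_i = c_i$.

```python
import numpy as np

rng = np.random.default_rng(7)
N, d = 5, 2
C = rng.normal(size=(N, d))       # c_i: minimizer of agent i's local loss

theta_consensus = C.mean(axis=0)  # solves (4): argmin_theta sum_i F_i(theta)
theta_adapted = C.copy()          # solves (5): argmin_{(theta_i)} sum_i F_i(theta_i)

# Heterogeneity cost of the shared model: sum_i F_i at each solution.
cost = lambda Theta: 0.5 * np.sum((Theta - C) ** 2)
print(cost(np.broadcast_to(theta_consensus, C.shape)), cost(theta_adapted))  # > 0 vs 0
```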


Challenge 3: Privacy

[Figure: N agents, each holding their own local observations]

Each agent wants:

1 To keep their data on their own storage
2 To ensure that sensitive information about their dataset cannot be recovered from observation of the model / the whole training process / inference queries / etc.


Challenge 3: Privacy - Solutions

Some of the main directions

1 Ensure that any individual datapoint does not influence the model too much
2 Encrypt the data. Can we still perform learning, or inference, on encrypted data?
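Direction 1 is the intuition behind differential privacy (Topic 8). A hedged DP-SGD-style sketch, with illustrative clipping and noise values rather than a calibrated privacy guarantee: clip each per-sample gradient so that no single datapoint can dominate, then add Gaussian noise to the average.

```python
import numpy as np

def private_grad_step(theta, per_sample_grads, eta=0.1, clip=1.0, sigma=1.0):
    """One DP-SGD-style step: clip each per-sample gradient, average, add noise.
    clip and sigma are illustrative; a real deployment calibrates sigma to an
    (epsilon, delta) budget via a privacy accountant."""
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12)) for g in per_sample_grads]
    noisy_mean = np.mean(clipped, axis=0) + \
        np.random.default_rng().normal(0.0, sigma * clip / len(clipped), size=theta.shape)
    return theta - eta * noisy_mean

theta = np.zeros(3)
grads = [np.ones(3) * 5.0, np.ones(3) * -2.0]  # hypothetical per-sample gradients
print(private_grad_step(theta, grads))
```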


Challenge 4: Compression, partial participation

Communication constraints
1 Communication can be the bottleneck in distributed systems: can we still get a speedup?
2 Uploading and downloading updates: network saturation, limited bandwidth?
3 Unavailability of some workers?

[Figure: N workers uploading updates to, and downloading the model from, a server]
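One popular answer is to compress updates before sending them. A minimal sketch of top-k sparsification (one option among many, alongside quantization): only the k largest-magnitude coordinates of an update are transmitted.

```python
import numpy as np

def top_k(update: np.ndarray, k: int) -> np.ndarray:
    """Keep the k largest-magnitude coordinates, zero out the rest.
    Only k (index, value) pairs need to be transmitted."""
    out = np.zeros_like(update)
    idx = np.argpartition(np.abs(update), -k)[-k:]
    out[idx] = update[idx]
    return out

g = np.random.default_rng(8).normal(size=1_000)
print(np.count_nonzero(top_k(g, 10)))  # 10: a 100x reduction in upload size
```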


Summary: The Federated Learning framework I

Federated Learning

1 A framework combining several aspects of collaborative learning
2 "Introduced" in 2016; has attracted a lot of interest.
3 Numerous start-ups and industrial actors working on the topic!

1 Centralized (in the classical definition)
2 Learning from a set of N agents:

$$\min_{\theta \in \mathbb{R}^d} \Big\{ F(\theta) := \frac{1}{N} \sum_{i=1}^{N} \underbrace{\mathbb{E}_{z \sim D_i}[\ell(z, \theta)]}_{F_i(\theta)} \Big\}.$$

3 Tackles both
  the averaging consensus problem
  the adaptation problem
4 Data distribution:
  cross-silo
  cross-device
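A minimal FedAvg-style sketch (simulated in one process, with hypothetical quadratic losses standing in for the $F_i$; a sketch of the general pattern, not a reference implementation): each sampled client runs a few local gradient steps from the server model, then the server averages the returned models.

```python
import numpy as np

rng = np.random.default_rng(9)
N, d = 20, 3
C = rng.normal(size=(N, d))  # stand-in: client i has F_i(theta) = 0.5 * ||theta - c_i||^2

def client_update(i, theta, local_steps=5, eta=0.1):
    """A few local gradient steps on client i's loss, starting from the server model."""
    theta = theta.copy()
    for _ in range(local_steps):
        theta -= eta * (theta - C[i])
    return theta

theta = np.zeros(d)
for _ in range(200):
    clients = rng.choice(N, size=5, replace=False)      # partial participation
    theta = np.mean([client_update(i, theta) for i in clients], axis=0)
print(np.linalg.norm(theta - C.mean(axis=0)))           # near the consensus optimum,
                                                        # up to client-sampling noise
```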


Summary: The Federated Learning framework II

5 Some concerns
  Privacy
  Non-i.i.d. agents
  Optimization under bandwidth constraints, partial participation
6 Important implementation issues




Two Classical Examples

Collaboration between hospitals:

[Figure: map of the hospitals in the 13th and 14th arrondissements]

Building a collaborative and personalized text model:




Challenge: Robustness

How bad is it if...

1 Some of the workers have a different distribution?
2 Some of the workers make mistakes?
3 Some of the workers actively try to degrade the learning process?
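A first taste of the robustness question (a sketch, not the specific estimators studied in the fault-tolerance lecture): the mean of workers' gradients is destroyed by a single Byzantine worker, while the coordinate-wise median is not.

```python
import numpy as np

rng = np.random.default_rng(10)
honest = rng.normal(loc=1.0, scale=0.1, size=(9, 3))  # 9 honest gradient estimates
byzantine = np.full((1, 3), 1e6)                      # 1 worker sends garbage
grads = np.vstack([honest, byzantine])

print(grads.mean(axis=0))        # ruined: dominated by the Byzantine update
print(np.median(grads, axis=0))  # close to 1.0: the median tolerates a minority of faults
```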


Robustness from distribution

The single point of failure problem throughout disciplines.



Robustness from distribution

ARPANET cold war motivation: how to make sure information remains available and can still be disseminated after a nuclear attack?




Lack of robustness because of distribution

How to manage the messy problem of synchronizing nodes, making them agree on values despite crashes, errors in data, asynchrony, etc.?




Fault tolerance



Loss functions from high level goals

From business models to loss functions

(the problem is the same for every other recommender algorithm)



Loss functions from high level goals

From business models to loss functions:

What are the features and the labels in a recommender system?


The alignment problem

Beyond gadgets, machine learning is about automating inductive thinking, i.e., inferring laws from observations.
(Most of mathematics is about deductive thinking.)
However, this can hardly be done without proxy measures: loss functions.

Goodhart's law
When a measure becomes a target, it ceases to be a good measure.

The proxy will inevitably differ from the intended goal, but we can study the mismatch and avoid pitfalls.


Other questions (not covered in the class*)

Industrial perspective:
1 How to get people to engage?
2 Loss functions from business models
Insurance:
1 Liability for loss of privacy in a privacy-preserving framework
Legal aspects:
1 How do we define privacy in the law?
2 Role of the GDPR?
Economics:
1 Data valuation?
2 Value sharing?
Cryptography:
1 Role of homomorphic encryption?
2 MPC (secure multi-party computation)?

*: but for which you will be (technically) better equipped by the end of the class.
