Online Robust Regression via SGD on the l1 loss

Pesme, Scott; Flammarion, Nicolas

Computer Science > Machine Learning

arXiv:2007.00399 (cs)

[Submitted on 1 Jul 2020]

Title:Online Robust Regression via SGD on the l1 loss

Authors:Scott Pesme, Nicolas Flammarion

View PDF

Abstract:We consider the robust linear regression problem in the online setting where we have access to the data in a streaming manner, one data point after the other. More specifically, for a true parameter $\theta^*$, we consider the corrupted Gaussian linear model $y = \langle x , \ \theta^* \rangle + \varepsilon + b$ where the adversarial noise $b$ can take any value with probability $\eta$ and equals zero otherwise. We consider this adversary to be oblivious (i.e., $b$ independent of the data) since this is the only contamination model under which consistency is possible. Current algorithms rely on having the whole data at hand in order to identify and remove the outliers. In contrast, we show in this work that stochastic gradient descent on the $\ell_1$ loss converges to the true parameter vector at a $\tilde{O}( 1 / (1 - \eta)^2 n )$ rate which is independent of the values of the contaminated measurements. Our proof relies on the elegant smoothing of the non-smooth $\ell_1$ loss by the Gaussian data and a classical non-asymptotic analysis of Polyak-Ruppert averaged SGD. In addition, we provide experimental evidence of the efficiency of this simple and highly scalable algorithm.

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as:	arXiv:2007.00399 [cs.LG]
	(or arXiv:2007.00399v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2007.00399

Submission history

From: Scott Pesme [view email]
[v1] Wed, 1 Jul 2020 11:38:21 UTC (900 KB)

Full-text links:

Access Paper:

view license

Current browse context:

cs.LG

< prev | next >

new | recent | 2020-07

Change to browse by:

cs
math
math.OC
stat
stat.ML

References & Citations

DBLP - CS Bibliography

listing | bibtex

Nicolas Flammarion

export BibTeX citation

Computer Science > Machine Learning

Title:Online Robust Regression via SGD on the l1 loss

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:Online Robust Regression via SGD on the l1 loss

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators