Critical points of an autoencoder can provably recover sparsely used overcomplete dictionaries
A Rangamani, A Mukherjee, A Arora… - arXiv preprint arXiv:1708.03735, 2017 - researchgate.net
Abstract
In Dictionary Learning one is trying to recover incoherent matrices $A^* \in \mathbb{R}^{n \times h}$ (typically overcomplete and whose columns are assumed to be normalized) and sparse vectors $x^* \in \mathbb{R}^h$ with a small support of size $h^p$ for some $0 < p < 1$, while being given access to observations $y \in \mathbb{R}^n$ where $y = A^* x^*$. In this work we undertake a rigorous analysis of the possibility that dictionary learning could be performed by gradient descent on autoencoders, which are $\mathbb{R}^n \to \mathbb{R}^n$ neural networks with a single ReLU activation layer of size $h$.
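To make the setup concrete, the following is a minimal numpy sketch of the generative model $y = A^* x^*$ and of one common instantiation of such an autoencoder: a weight-tied encoder-decoder with a single ReLU layer of size $h$. The support distribution, the bias value `b`, and the weight tying are illustrative assumptions here, not necessarily the paper's exact choices.

```python
import numpy as np

def sample_dictionary(n, h, rng):
    """A random dictionary with unit-norm columns, standing in for the
    incoherent A* of the abstract (incoherence is not enforced here)."""
    A = rng.standard_normal((n, h))
    return A / np.linalg.norm(A, axis=0, keepdims=True)

def sample_sparse_code(h, p, rng):
    """A sparse code x* in R^h supported on about h**p coordinates, with
    uniform values on the support -- one plausible reading of the
    'natural distributional assumptions', not the paper's exact law."""
    k = max(1, int(round(h ** p)))
    x = np.zeros(h)
    support = rng.choice(h, size=k, replace=False)
    x[support] = rng.uniform(0.5, 1.5, size=k)
    return x

def relu_autoencoder(W, b, y):
    """A weight-tied R^n -> R^n autoencoder: W has shape (h, n), the
    hidden layer is a single ReLU of size h, the decoder reuses W^T."""
    code = np.maximum(W @ y - b, 0.0)  # hidden ReLU activations in R^h
    return W.T @ code                  # reconstruction of y in R^n

rng = np.random.default_rng(0)
n, h, p = 32, 256, 0.5
A_star = sample_dictionary(n, h, rng)
x_star = sample_sparse_code(h, p, rng)
y = A_star @ x_star                         # observation y = A* x*
y_hat = relu_autoencoder(A_star.T, 0.2, y)  # autoencoder evaluated at A*
```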
Towards the above objective we propose a new autoencoder loss function which modifies the squared reconstruction-error term and also adds new regularization terms. We construct a proxy for the expected gradient of this loss function, which we motivate with high-probability arguments under natural distributional assumptions on the sparse code $x^*$. Under the same distributional assumptions on $x^*$, we show that, in the limit of large enough sparse-code dimension, any zero point of our proxy for the expected gradient of the loss function within a certain radius of $A^*$ corresponds to dictionaries whose action on the sparse vectors is indistinguishable from that of $A^*$. We also report simulations on synthetic data in support of our theory.
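For reference, this is the unmodified squared reconstruction error that such an objective starts from, reusing the sampler and autoencoder sketched above; the paper's modifications to this term and its added regularizers are not reproduced in this sketch.

```python
def reconstruction_loss(W, b, Y):
    """Plain batched squared reconstruction error, 0.5 * mean ||y_hat - y||^2.
    The paper's objective modifies this term and adds regularizers,
    which are omitted from this sketch."""
    code = np.maximum(Y @ W.T - b, 0.0)  # ReLU codes, shape (m, h)
    Y_hat = code @ W                     # decoded batch, shape (m, n)
    return 0.5 * np.mean(np.sum((Y_hat - Y) ** 2, axis=1))

# Evaluate the loss at the ground-truth dictionary on a synthetic batch;
# the theory concerns zeros of (a proxy for) the expected gradient of the
# full modified objective near A*, not of this plain loss.
Y = np.stack([A_star @ sample_sparse_code(h, p, rng) for _ in range(64)])
print(reconstruction_loss(A_star.T, 0.2, Y))
```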