Publication:
An improved analysis of the ER-SpUD dictionary learning algorithm

Thumbnail Image

Date

2016

Published Version

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Błasiok, Jarosław, and Jelani Nelson. 2016. "An improved analysis of the ER-SpUD dictionary learning algorithm." Working paper.

Research Data

Abstract

In "dictionary learning" we observe Y=AX+E for some Y∈ℝn×p, A∈ℝm×n, and X∈ℝm×p. The matrix Y is observed, and A,X,E are unknown. Here E is "noise" of small norm, and X is column-wise sparse. The matrix A is referred to as a {\em dictionary}, and its columns as {\em atoms}. Then, given some small number p of samples, i.e.\ columns of Y, the goal is to learn the dictionary A up to small error, as well as X. The motivation is that in many applications data is expected to sparse when represented by atoms in the "right" dictionary A (e.g.\ images in the Haar wavelet basis), and the goal is to learn A from the data to then use it for other applications. Recently, [SWW12] proposed the dictionary learning algorithm ER-SpUD with provable guarantees when E=0 and m=n. They showed if X has independent entries with an expected s non-zeroes per column for 1≲s≲n‾√, and with non-zero entries being subgaussian, then for p≳n2log2n with high probability ER-SpUD outputs matrices A′,X′ which equal A,X up to permuting and scaling columns (resp.\ rows) of A (resp.\ X). They conjectured p≳nlogn suffices, which they showed was information theoretically necessary for {\em any} algorithm to succeed when s≃1. Significant progress was later obtained in [LV15]. We show that for a slight variant of ER-SpUD, p≳nlog(n/δ) samples suffice for successful recovery with probability 1−δ. We also show that for the unmodified ER-SpUD, p≳n1.99 samples are required even to learn A,X with polynomially small success probability. This resolves the main conjecture of [SWW12], and contradicts the main result of [LV15], which claimed that p≳nlog4n guarantees success whp.

Description

Other Available Sources

Keywords

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Referenced By

Related Stories