Publication:

Implicit methods for iterative estimation with large data sets

Loading...
Thumbnail Image

Date

2016-04-25

Published Version

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Toulis, Panagiotis. 2016. Implicit methods for iterative estimation with large data sets. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

The ideal estimation method needs to fulfill three requirements: (i) efficient computation, (ii) statistical efficiency, and (iii) numerical stability. The classical stochastic approximation of (Robbins, 1951) is an iterative estimation method, where the current iterate (parameter estimate) is updated according to some discrepancy between what is observed and what is expected assuming the current iterate has the true parameter value. Classical stochastic approximation undoubtedly meets the computation requirement, which explains its widespread popularity, for example, in modern applications of machine learning with large data sets, but cannot effectively combine it with efficiency and stability. Surprisingly, the stability issue can be improved substantially, if the aforementioned discrepancy is computed not using the current iterate, but using the conditional expectation of the next iterate given the current one. The computational overhead of the resulting implicit update is minimal for many statistical models, whereas statistical efficiency can be achieved through simple averaging of the iterates, as in classical stochastic approximation (Ruppert, 1988). Thus, implicit stochastic approximation is fast and principled, fulfills requirements (i-iii) for a number of popular statistical models including generalized linear models, M-estimation, and proportional hazards, and it is poised to become the workhorse of estimation with large data sets in statistical practice.

Description

Other Available Sources

Research Data

Keywords

Statistics, Operations Research

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Related Stories