Publication:
Asymptotic and finite-sample properties of estimators based on stochastic gradients

Date

2017

Publisher

Institute of Mathematical Statistics

Citation

Toulis, Panos, and Edoardo M. Airoldi. 2017. “Asymptotic and Finite-Sample Properties of Estimators Based on Stochastic Gradients.” The Annals of Statistics 45 (4) (August): 1694–1727. doi:10.1214/16-aos1506.

Abstract

Stochastic gradient descent procedures have gained popularity for parameter estimation from large data sets. However, their statistical properties are not well understood, in theory. And in practice, avoiding numerical instability requires careful tuning of key parameters. Here, we introduce implicit stochastic gradient descent procedures, which involve parameter updates that are implicitly defined. Intuitively, implicit updates shrink standard stochastic gradient descent updates. The amount of shrinkage depends on the observed Fisher information matrix, which does not need to be explicitly computed; thus, implicit procedures increase stability without increasing the computational burden. Our theoretical analysis provides the first full characterization of the asymptotic behavior of both standard and implicit stochastic gradient descent-based estimators, including finite-sample error bounds. Importantly, analytical expressions for the variances of these stochastic gradient-based estimators reveal their exact loss of efficiency. We also develop new algorithms to compute implicit stochastic gradient descent-based estimators for generalized linear models, Cox proportional hazards, M-estimators, in practice, and perform extensive experiments. Our results suggest that implicit stochastic gradient descent procedures are poised to become a workhorse for approximate inference from large data sets.
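
The shrinkage described in the abstract can be illustrated in the simplest setting, which is an assumption of this note rather than something stated on this page: a linear model with squared-error loss. There the implicit update theta_new = theta + gamma * (y - x.theta_new) * x can be solved in closed form, and it equals the standard update with the step scaled by 1 / (1 + gamma * ||x||^2). The function names and the pure-Python list representation below are hypothetical choices for illustration, not the authors' algorithms for generalized linear models, Cox proportional hazards, or M-estimators.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(ai * bi for ai, bi in zip(a, b))


def explicit_sgd_step(theta, x, y, gamma):
    """Standard SGD step for squared-error loss.

    theta_new = theta + gamma * (y - x.theta) * x
    """
    resid = y - dot(x, theta)
    return [t + gamma * resid * xi for t, xi in zip(theta, x)]


def implicit_sgd_step(theta, x, y, gamma):
    """Implicit SGD step for squared-error loss (linear model only).

    The update is defined by the fixed-point equation
        theta_new = theta + gamma * (y - x.theta_new) * x,
    whose solution is the explicit step shrunk by 1 / (1 + gamma * ||x||^2).
    """
    resid = y - dot(x, theta)
    shrink = 1.0 / (1.0 + gamma * dot(x, x))  # always in (0, 1] for gamma >= 0
    return [t + gamma * shrink * resid * xi for t, xi in zip(theta, x)]
```

In this sketch the shrink factor depends on the data only through ||x||^2, so no matrix is formed or inverted. With a large learning rate the explicit step can overshoot, while the implicit step moves the fitted value x.theta only part of the way toward y, which is one way to read the stability claim in the abstract.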

Terms of Use

This article is made available under the terms and conditions applicable to Open Access Policy Articles (OAP), as set forth at Terms of Service
