Publication: Efficient and Scalable Tiny Machine Learning
Abstract
Machine learning (ML) has revolutionized computing, but for ML to become fully ubiquitous, it must be deployed on the billions of ultra-low-power sensing devices that already permeate society. This requires extreme levels of efficiency that are typically achieved through expensive specialization and optimization; however, this is impractical given the diversity of use cases we wish to address. We need scalable solutions that achieve efficiency without incurring substantial engineering costs.
In this thesis, we discuss techniques to achieve extreme efficiency across the ML stack by creating benchmarks, automating model design, and bootstrapping datasets. First, this thesis introduces MLPerf Tiny, a benchmark for ultra-low-power ML (TinyML) hardware, which enables researchers to isolate the impact of individual optimizations and makes progress more measurable and tractable. Second, we characterize ML performance on commodity microcontrollers (MCUs) and automate the process of hardware-specific model design with a fast neural architecture search tool. This enables state-of-the-art energy efficiency without substantial manual effort or a computationally expensive search algorithm. Finally, we present Wake Vision, a large, high-quality dataset for TinyML person detection, and demonstrate that TinyML-relevant datasets can be bootstrapped through automated data filtering techniques. Additionally, Wake Vision includes a fine-grained benchmark suite to measure the robustness and fairness of a model in challenging settings. The contributions described in this thesis establish a foundation for future TinyML research and chart a path toward smart, ubiquitous computing.