Publication: Machine Learning for Machines: Data-Driven Performance Tuning at Runtime Using Sparse Coding
Date
2015-01-21
Authors
Tarsa, Stephen J.
Citation
Tarsa, Stephen J. 2015. Machine Learning for Machines: Data-Driven Performance Tuning at Runtime Using Sparse Coding. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract
We develop methods for adjusting device configurations to runtime conditions based on system-state predictions. Our approach statistically models performance data collected either by actively probing conditions such as wireless link quality or by leveraging existing infrastructure such as hardware performance counters. By predicting future runtime characteristics, we enable on-the-fly changes to wireless transmission schedules, circuit voltage and frequency, and data placement in storage systems. In highly variable everyday use-cases, we demonstrate large performance gains not by designing new protocols or system configurations, but by more judiciously using those that exist.
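As a rough illustration of the predict-then-reconfigure loop this approach implies, the sketch below shows one tuning iteration. The function names, the policy table, and the feature source are hypothetical placeholders for illustration only, not the dissertation's implementation.

```python
# Illustrative sketch only: observe -> predict -> reconfigure.
# read_counters, predict_state, apply_config, and POLICY are hypothetical
# placeholders, not APIs from the dissertation.
from typing import Callable, Sequence

POLICY = {                      # assumed mapping from predicted state to a configuration
    "bursty_loss": "conservative_schedule",
    "clean_link": "aggressive_schedule",
}

def tune_once(read_counters: Callable[[], Sequence[float]],
              predict_state: Callable[[Sequence[float]], str],
              apply_config: Callable[[str], None]) -> None:
    """One iteration: measure conditions, predict the near-future state, reconfigure."""
    features = read_counters()        # e.g., probe link quality or sample HW performance counters
    state = predict_state(features)   # statistical model's prediction of upcoming conditions
    apply_config(POLICY[state])       # e.g., switch transmission schedule or DVFS setting
```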
This thesis presents a state-modeling framework based on sparse feature representation. It is applied in diverse application scenarios to data representing:
1. Packet loss over diverse wireless links
2. Circuit performance counters collected during user-driven workloads
3. Access pattern statistics measured from data-center storage systems
Our framework uses unsupervised clustering to discover latent statistical structure in large datasets. We exploit this stable structure to reduce overfitting in supervised learning models like Support Vector Machine (SVM) classifiers and Classification and Regression Trees (CART) trained on small datasets. As a result, we can capture transient predictive statistics that change based on wireless environment, circuit workload, and storage application. Given the magnitude of performance improvements and the potential economic opportunity, we hope that this work becomes the foundation for a broad investigation into on-platform data-driven device optimization, dubbed Machine Learning for Machines (MLM).
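To make this division of labor concrete, the following is a minimal sketch, assuming scikit-learn and synthetic stand-in data rather than the dissertation's datasets or code: sparse features are learned without labels from a large corpus, then reused to train a small supervised classifier.

```python
# Minimal sketch (not the dissertation's code): sparse-coded features feeding a
# small supervised model, in the spirit of the framework described above.
# Data sizes and parameters are illustrative assumptions.
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-in for a large unlabeled corpus of performance measurements
# (e.g., windows of packet-loss traces or performance-counter vectors).
X_unlabeled = rng.normal(size=(2000, 64))

# Unsupervised step: learn a sparse dictionary from the large dataset.
dico = DictionaryLearning(n_components=32, transform_algorithm="lasso_lars",
                          transform_alpha=0.1, max_iter=20, random_state=0)
dico.fit(X_unlabeled)

# Small labeled dataset: encode it against the learned dictionary, so the
# supervised model sees stable sparse features rather than raw measurements.
X_small = rng.normal(size=(200, 64))
y_small = rng.integers(0, 2, size=200)   # e.g., "loss burst imminent" vs. not
codes = dico.transform(X_small)

# Supervised step: a lightweight classifier trained on the sparse codes.
clf = SVC(kernel="linear").fit(codes, y_small)
print(clf.predict(dico.transform(X_small[:5])))
```

The same pattern applies with a CART-style decision tree in place of the SVM; the point of the sketch is only that the dictionary is fit on plentiful unlabeled data, while the scarce labeled data is used solely for the final classifier.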
Keywords
Computer Science
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the repository's Terms of Service.