Publication: Towards Learned Access Path Selection: Using Artificial Intelligence to Determine the Decision Boundary of Scan vs Index Probes in Data Systems
Date
2020-03-03
Authors
Kastroulis, Angelo
Citation
Kastroulis, Angelo. 2019. Towards Learned Access Path Selection: Using Artificial Intelligence to Determine the Decision Boundary of Scan vs Index Probes in Data Systems. Master's thesis, Harvard Extension School.
Abstract
All data systems require optimization. Determining the most performant recipe for data retrieval is commonly referred to as access path selection (APS). This thesis introduces the use of artificial neural networks for query optimization in APS.
The problems with query optimizers are well known. Promising research in other areas of query optimization using machine learning techniques continues at a rapid pace. Research has proposed machine learning solutions for join order enumeration, cardinality estimation, and predicting optimizer heuristics (such as cost estimation and table statistics). However, a practical method for learned APS remains an open research topic.
APS is difficult because an optimizer must be aware of the ever-changing system state, hardware, and data. Incorrect assumptions about any of these can be very costly, and finding a solution requires years of research.
In this thesis, I present an artificial intelligence-based approach to APS and introduce a learned optimization method using neural networks. Moreover, I explore the challenges inherent in applying generalized neural network techniques to APS. I empirically show that these networks significantly outperform the query optimizer's accuracy in determining the truly performant access path.
Keywords
APS, access path selection, MLP, multi-layer perceptron, AI, neural networks, query optimizer, index, b+tree, machine learning, data systems
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service