Towards Learned Access Path Selection: Using Artificial Intelligence to Determine the Decision Boundary of Scan vs Index Probes in Data Systems
Citation: Kastroulis, Angelo. 2019. Towards Learned Access Path Selection: Using Artificial Intelligence to Determine the Decision Boundary of Scan vs Index Probes in Data Systems. Harvard Extension School Thesis.
Abstract: All data systems require optimization. Determining the most performant recipe for data retrieval is commonly referred to as access path selection (APS). This paper introduces the use of artificial neural networks for query optimization in APS. The problems with query optimizers are well known. Promising research in other areas of query optimization using machine learning techniques continues at a rapid pace. Research has proposed machine learning solutions for join order enumeration, cardinality estimation, and predicting optimizer heuristics (such as cost estimation and table statistics). However, a practical method for learned APS remains an open research topic. APS is difficult because an optimizer must be aware of the ever-changing system state, hardware, and data. Incorrect assumptions in any of those can be very costly, and finding a solution requires years of research. In this thesis, I present an artificial intelligence-based approach to APS and introduce a learned optimization method using neural networks. Moreover, I explore the challenges inherent in applying generalized neural network techniques to APS. I empirically show that these networks significantly outperform the query optimizer’s accuracy in determining the truly performant access path.
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:42659221