Publication:

Combining Datalog and SAT-Based Solving in Code-Reasoning Tools

Loading...
Thumbnail Image

Date

2023-09-07

Published Version

Published Version

Journal Title

Journal ISSN

Volume Title

Publisher

The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Bembenek, Aaron. 2023. Combining Datalog and SAT-Based Solving in Code-Reasoning Tools. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Datalog and SAT solving are powerful forms of automated logic that are both used to reason about the correctness of software. However, they are very different, and they are not typically combined in the same automated reasoning tool. This dissertation explores how to build code-reasoning tools that combine these complementary technologies, through two main thrusts: 1) the design and implementation of the logic-centric programming language Formulog, and 2) a progression of tools for Datalog program synthesis. In doing so, it explores the challenge of combining Datalog and SAT-based reasoning at multiple levels, viewing it variously as a language design problem, a system implementation problem, and a constraint solving problem. A domain-specific language for writing static analyses (programs that analyze other programs), Formulog augments Datalog with first-order functional programming and SMT solving, an extension to SAT solving. In Formulog, SMT formulas are data that can be constructed and inspected; a novel type system allows expressive SMT formulas while ensuring that neither normal evaluation nor SMT solving crashes. Formulog’s combination of features makes it possible to state a range of SMT-based analyses in a way that is both close to their formal specifications, and amenable to high-level optimizations and efficient evaluation. In addition to the design of Formulog, this dissertation presents implementation strategies for achieving good runtime performance. This includes a compiler from Formulog to Soufflé (an established Datalog system), as well as “eager evaluation,” a novel parallel execution algorithm for Datalog that leads to improved SMT solving times when used to evaluate Formulog programs. Thanks to these techniques, Formulog’s performance is competitive with (and sometimes much better than) non-Datalog reference implementations on diverse SMT-based case studies: refinement type checking, bottom-up points-to analysis, and symbolic execution. Finally, the dissertation presents a progression of tools for synthesizing Datalog programs (i.e., generating them automatically from specifications). Each tool explores a different way of combining Datalog and SAT/SMT solving; the ultimate solution takes advantage of answer set programming (ASP)—a constraint solving paradigm that subsumes Datalog and SAT-like reasoning—and achieves nearly order-of-magnitude geometric mean speedups over the prior state-of-the-art solution. This work elucidates connections between Datalog and SAT/SMT solving (for example, it shows that every Datalog program is the basis of a theoretically efficient SMT theory), and suggests the potential of using ASP to reason about Datalog programs.

Description

Other Available Sources

Research Data

Keywords

Datalog, logic programming, program analysis, program synthesis, SAT solving, SMT solving, Computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Related Stories