SHF:Small:Scalable and Precise Program Analyses via Linear Conjunctive Language Reachability

$497,896FY2018CSENSF

Georgia Tech Research Corporation, Atlanta GA

Investigators

Abstract

Static program analysis provides foundational and practical techniques to help build reliable and secure software. Context-free language (CFL) reachability has been widely adopted for specifying program analysis problems. However, little foundational progress exists on advancing the CFL-reachability framework itself. To support precise and scalable program analyses, an expressive, accessible class of formal language reachability is needed to bridge fundamental formal language research and practical analysis-based tool development. The project's novelties are twofold: a new powerful formalism for specifying program analyses, and algorithms and techniques for realizing a practical framework based on this formalism. The project's impacts are deepened knowledge on and improved capabilities for building precise and scalable static analyses, as well as practical analyses for improving software reliability and security. This project will explore linear conjunctive language (LCL) reachability as a new static analysis formalism. In contrast to CFLs, LCLs are closed under all set-theoretic operations and can also be efficiently recognized in quadratic time. A significant number of advanced program analyses need to match properties described by multiple CFLs simultaneously. LCLs can precisely express many such properties, while CFLs cannot because they are not closed under intersection. Thus, LCL reachability offers a novel perspective in specifying and realizing program analyses. The investigators' initial work on LCL-reachability has shown considerable promise, leading to both more precise and orders of magnitude more scalable alias and taint analysis, two widely-used analyses. This project aims to fully exploit LCL-reachability's potentials by developing a unified solution for specifying program analysis problems in LCL and implementing novel data structures that support efficient LCL-reachability algorithms. It focuses on (1) theoretical development of the LCL-reachability formulation, (2) efficient algorithms for computing LCL-reachability, and (3) generalizing to practical program analyses. If successful, this project will significantly advance the state-of-the-art in software analysis and verification. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →