SaTC: CORE: Medium: Secure and Formally-verified Low-level Languages

$1,200,000FY2023CSENSF

University Of Pennsylvania, Philadelphia PA

Investigators

Abstract

Our modern computing infrastructure, including the Internet and multitude of applications that run on our phones and in data centers, relies on many low-level software systems to function correctly. Low-level software includes things like operating systems and the tools, such as programming language compilers, that are used by software developers to build those applications. Flaws in the design of low-level software, or bugs in its implementation, can lead to system crashes, security vulnerabilities, or incorrect results that can be very costly to individuals, businesses, and scientific or educational institutions. The research conducted as part of this project will investigate how to improve the security and reliability of low-level software. The research focuses on a specific piece of the modern software infrastructure (something called LLVM IR, an industrial-strength bit of compiler infrastructure that is widely used in industry and academia), and develop a formal, computer-checkable, mathematical model of its behaviors. That model will be used to uncover flaws in the infrastructure, and as an aid to implementing correct systems using it. Because such systems are ubiquitous, improving their security and reliability can potentially have significant positive impact for society as a whole. This project will also develop educational resources suitable for teaching the underlying theory and techniques to software developers who use this infrastructure. The project will build on and extend the capabilities of Vellvm--the "Verified LLVM IR"--a formalization of the LLVM compiler infrastructure implemented in the Coq interactive theorem prover. Because the LLVM ecosystem supports many source languages (C, C++, Rust, Haskell, among others) and target platforms (including almost all modern processors), it is a natural fulcrum to amplify the impact of formal modeling and verification efforts. The research conducted here will have three main thrusts: (1) Improving Vellvm's fidelity and completeness with respect to the LLVM IR by extending the suite of supported LLVM IR constructs, developing a new memory model, and concurrency semantics that enable optimizations unavailable previousl; (2) Designing abstractions that facilitate relational reasoning about LLVM IR programs. These abstractions will take the form of a collection of domain-specific logics for reasoning about LLVM IR code at various levels of abstraction; (3) applying the Vellvm framework to build tools that help identify undefined behaviors and mitigate security vulnerabilities in LLVM IR programs. To achieve these goals, the proposed research will develop novel reasoning and proof automation techniques that are applicable to low-level program semantics. All of the developments will be implemented and verified using the Coq interactive theorem prover, yielding a high-degree of confidence in the results. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →