Collaborative Research: SHF: Medium: Natural Language Models with Execution Data for Software Testing

$300,000 | FY2023 | CSE | NSF

University of Illinois at Urbana-Champaign, Urbana, IL

Investigators

Abstract

Natural Language Processing (NLP) models have proven useful for various software engineering tasks, including code completion, comment generation and update, code review generation, and clone detection. Despite the importance of software testing in industry, there has been little work on using these Artificial Intelligence (AI) models to develop and maintain test code, which is a key part of software testing in practice. Test code differs from regular code in several ways: (1) test code is structured in a specific way, with steps for setting up a test environment and comparing actual results against expected results; (2) test code has richer context, such as the specific methods and code it is testing (the code under test); (3) test code uses different code elements than the code under test, i.e., it has a different control structure; (4) test code has specific input values and expected results; (5) unlike regular code, test code can be readily executed. The goal of this project is to increase the productivity of software engineers via NLP models that simplify the development and maintenance of tests (NLP4Test). Specifically, the tasks include test generation and completion, test update (when the underlying code changes), and automatic migration of tests across programming languages. The project explores testing of both general codebases and emerging machine learning (ML) applications. NLP4Test is a novel domain that requires innovative NLP models. The outcomes of this project will include novel techniques, implementations of these techniques, and extensive evaluations on open-source projects.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
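As a rough illustration of the five characteristics of test code listed in the abstract, the following minimal sketch shows a small Python unit test. It is not taken from the project: the function normalize_whitespace, the test class, and the helper run_tests are hypothetical names invented for this example, and the numbered comments only map lines to the abstract's points.

```python
import unittest


def normalize_whitespace(text: str) -> str:
    """Hypothetical code under test: collapse whitespace runs to one space."""
    return " ".join(text.split())


class NormalizeWhitespaceTest(unittest.TestCase):
    def setUp(self) -> None:
        # (1) Setup step: prepare the test environment and a fixed input.
        self.raw = "  hello \t world\n"

    def test_collapses_whitespace(self) -> None:
        # (2) Richer context: the test targets one specific function.
        # (3) Straight-line control flow, unlike typical production code.
        actual = normalize_whitespace(self.raw)
        # (4) A specific input value is compared with an expected result.
        self.assertEqual(actual, "hello world")


def run_tests() -> unittest.TestResult:
    # (5) The test can be executed directly, producing pass/fail
    # execution data of the kind the project title refers to.
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NormalizeWhitespaceTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Under these assumptions, a model that completes or updates such a test could, in principle, use the result of run_tests() as feedback in addition to the surrounding source text.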
