CRII: SHF: Machine-Learning-Based Test Effectiveness Prediction

$174,150 | FY2016 | CSE | NSF

University of Texas at Dallas, Richardson, TX

Investigators

Abstract

Test effectiveness, the capability of tests to detect potential software bugs, is crucial for software testing. More effective tests can detect more potential bugs and thus help prevent the economic loss or even physical damage that software bugs cause. Consequently, a large body of research has been dedicated to test effectiveness evaluation over the past decades. Recently, mutation testing, a powerful methodology that measures test effectiveness by computing the detection rate of artificially injected bugs, has been drawing increasing attention from both academia and industry. Various studies have shown that the artificial bugs generated by mutation testing are close to real bugs, demonstrating the effectiveness of mutation testing for evaluating tests. However, a major obstacle for mutation testing is efficiency: it requires executing each artificially buggy version (i.e., mutant) to check whether the test suite can detect that bug, which is extremely time-consuming. Therefore, a lightweight but precise technique for measuring test effectiveness is highly desirable. The approach is to automatically extract test effectiveness information (e.g., mutation testing results) from various open-source projects and use it to directly predict the test effectiveness of the current project without any mutant execution. More specifically, the PI proposes to design a general classification framework based on a suite of static and dynamic features collected according to the PIE theory of fault detection. Furthermore, this research will explore judicious applications of advanced program analysis, machine learning, and software mining techniques for more powerful feature collection, more active learning, and more comprehensive training data preparation.
The proposed approach will result in efficient but precise test effectiveness evaluation for projects developed using various programming languages and test paradigms, which is crucial for high-quality software. Furthermore, training the classification models will require collecting a variety of basic testing, analysis, and mining information from a large number of open-source projects, and thus may also benefit a wide range of software testing, analysis, and mining techniques that explore open-source software repositories.
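
As a rough illustration of the cost the abstract describes, the sketch below contrasts a conventional mutation score, which runs the test suite against every mutant, with a score estimated from per-mutant kill predictions. All names (`mutation_score`, `predicted_mutation_score`), the 0.5 threshold, and the idea of receiving kill probabilities from some trained classifier are assumptions made for illustration; this is not the framework proposed in the award.

```python
from typing import Callable, Sequence

# Hypothetical types: a "program" is any two-argument callable under test,
# and a "test" returns True when the program passes it.
Program = Callable[[int, int], int]
Test = Callable[[Program], bool]


def mutation_score(mutants: Sequence[Program], tests: Sequence[Test]) -> float:
    """Fraction of mutants killed, i.e. failed by at least one test.

    Every mutant is executed against the tests, which is the expensive
    step that a predictive approach tries to avoid.
    """
    if not mutants:
        raise ValueError("at least one mutant is required")
    killed = 0
    for mutant in mutants:
        # any() stops at the first failing test for this mutant.
        if any(not test(mutant) for test in tests):
            killed += 1
    return killed / len(mutants)


def predicted_mutation_score(kill_probabilities: Sequence[float],
                             threshold: float = 0.5) -> float:
    """Estimate the score from predicted kill probabilities, one per mutant.

    The probabilities are assumed to come from some classifier trained on
    other projects; no mutant is executed here. The threshold is an
    illustrative choice.
    """
    if not kill_probabilities:
        raise ValueError("at least one prediction is required")
    predicted_killed = sum(1 for p in kill_probabilities if p >= threshold)
    return predicted_killed / len(kill_probabilities)
```

For example, with the mutants `a - b` and `a * b` of an addition function and a single test checking that `f(2, 2) == 4`, only the subtraction mutant fails the test, so the executed score is 0.5; adding a second test for `f(1, 2) == 3` kills the multiplication mutant as well.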
