SHF: Small: Collaborative Research: Better Comprehension of Software Engineering Data

$242,433 | FY2010 | CSE | NSF

West Virginia University Research Corporation, Morgantown WV

Abstract

The amount of data generated during the development of today's software systems is staggering. It includes source code, developer e-mails, bug information, testing results, analysis data, process information, requirements, and more. The size and complexity of this information make it impossible for developers to reason about it. Data mining techniques are a common way to extract what is relevant to developers and managers. The success and quality of these software projects depend on the software engineers' ability to customize generic data mining algorithms to specific software engineering data. This project will produce tools and techniques that allow software developers and managers to easily customize and apply data mining techniques to a variety of software engineering problems. Such solutions will become more practical and will help many existing approaches migrate from the research lab into industry. Underrepresented categories of students will participate in this research. The project will enhance the existing software engineering curriculum and facilitate the inclusion of data mining solutions in the repertoire of future software engineering practitioners and researchers. Specifically, the project will improve the state-of-the-art solutions to three important software engineering tasks: concept location in software, software defect prediction, and development effort estimation. The project will produce an algorithm customization methodology and a framework that will be instantiated for a variety of combinations of data mining algorithm × software engineering task × software system data. The customization problem is framed and addressed as an optimization problem. The resulting customization agent will assist the software engineering user in efficiently selecting the best configuration, which includes a set of algorithms and their parameter values, customized for a particular task and software system. All tools and methodologies will be empirically evaluated in academic and industrial settings.
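
To make the idea of customization as optimization concrete, here is a minimal Python sketch. It is not the project's customization agent: the abstract does not say which optimizer or learners are used, so this example substitutes an exhaustive search over a tiny, hypothetical configuration space (`SEARCH_SPACE`), two toy effort estimators (`"knn"` and `"mean"`), and synthetic (size, effort) data. All names and values are assumptions chosen for illustration.

```python
import itertools

# Hypothetical search space: algorithm name -> grid of parameter values.
SEARCH_SPACE = {
    "knn": {"k": [1, 3, 5]},   # nearest-neighbour effort estimator
    "mean": {},                # baseline: always predict the mean effort
}


def predict(algorithm, params, train, x):
    """Predict effort for size x from (size, effort) training pairs."""
    if algorithm == "mean":
        return sum(y for _, y in train) / len(train)
    # knn: average effort of the k training projects closest in size
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[: params["k"]]
    return sum(y for _, y in nearest) / len(nearest)


def configurations(space):
    """Enumerate every (algorithm, parameter dict) pair in the space."""
    for algorithm, grid in space.items():
        names = sorted(grid)
        for values in itertools.product(*(grid[n] for n in names)):
            yield algorithm, dict(zip(names, values))


def loo_error(algorithm, params, data):
    """Objective: leave-one-out mean absolute error on this data set."""
    total = 0.0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        total += abs(predict(algorithm, params, train, x) - y)
    return total / len(data)


def best_configuration(space, data):
    """Return the configuration that minimises the objective."""
    return min(configurations(space), key=lambda c: loo_error(c[0], c[1], data))


# Synthetic data (assumption): effort grows linearly with project size.
data = [(float(s), 2.0 * s) for s in range(1, 11)]
best_algorithm, best_params = best_configuration(SEARCH_SPACE, data)
print(best_algorithm, best_params)
```

Exhaustive enumeration is only workable for small spaces like this one; a practical agent would presumably need a more efficient search strategy, which the abstract leaves unspecified.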
