I-Corps: Translation potential of using generative models for automatic test question generation and evaluation for educational assessment applications

$50,000 | FY2024 | TIP | NSF

University Of Virginia Main Campus, Charlottesville VA

Investigators

Abstract

The broader impact of this I-Corps project is the development of an intelligent test question development assistant, which could support K-12 test developers by automatically generating high-quality questions and responses. This solution would benefit the testing and educational support industry, including but not limited to K-12 testing companies, language testing agencies, online education platforms, professional certifiers and licensure groups, and classroom teachers. The time and cost of test development would be significantly reduced. The technology will help advance the development and adoption of generative artificial intelligence techniques in educational measurement and assessment, contributing to the next generation of artificial intelligence tools for education.
This I-Corps project utilizes experiential learning coupled with a first-hand investigation of the industry ecosystem to assess the translation potential of the technology. This solution is based on the development of a generative artificial intelligence tool for test question development and evaluation for educational assessment applications. Test question development has been recognized as an extremely time-consuming, labor-intensive, and expensive process in both traditional paper-and-pencil testing and computerized adaptive testing. Automatic generation and evaluation of test questions present a promising solution and have attracted considerable attention in the past decade. This automatic question generation and evaluation system leverages customized large foundation models to generate questions for various educational assessment tasks, such as K-12 standardized tests and language tests. In addition, the system is able to generate high-quality test questions that are well aligned with user specifications, such as test blueprints, fairness, and difficulty levels.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
