Statistical Learning for Innovative Assessment
Columbia University, New York NY
Investigators
Abstract
This research project will develop statistical methods for modern cognitive assessment. The increasing use of computer-based testing has produced a variety of high-dimensional data sets with complex structure. The project will focus on statistical modeling and inference for such large-scale data sets. Specific topics to be addressed include the analysis of process data, adaptive learning through a reinforcement learning framework, and the development of computational methods for fitting the resulting models. The results of this research will provide a deeper understanding of the complex data structures collected in technology-rich interactive tasks. The project will shed light on items in learning and assessment environments that are delivered online, both in client-server configurations and in stand-alone applications. The project will provide guidelines to improve item quality with a focus on more innovative item types, such as those in scenario-based and simulation-based environments for the assessment of students' knowledge and skills in the STEM fields. Educational researchers will be provided with tools to identify patterns in high-dimensional data and sequence data. Students in instructional and interventional programs will benefit from this research, especially in the STEM fields that are increasingly defined by digital media and technology-based interaction and communication. Recent large-scale computer-based assessments have introduced a number of interactive problem-solving items and collaborative problem-solving items. The investigators will develop statistical methods for the analysis of these new items.
The investigators will concentrate on four particularly challenging aspects of analyzing modern computer-based assessments: 1) predicting human behavior by means of modern machine learning techniques; 2) extracting latent structure and graphical structure from process data collected by interactive problem-solving items through event history analysis; 3) providing personalized learning material through a reinforcement learning framework; and 4) developing numerical methods to optimize high-dimensional functions either stochastically or deterministically. The models to be developed will combine latent variable and graphical approaches as well as deep-learning techniques for high-dimensional data. For modeling process data, the investigators will employ recent advances in modeling and segmentation techniques from natural language processing. For computation, the investigators will develop adaptive Robbins-Monro stochastic approximation methods. Optimization algorithms will be developed using recent advances in numerical methods. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
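For readers unfamiliar with the Robbins-Monro stochastic approximation named in the abstract, the following is a minimal sketch of the classical, non-adaptive scheme only. It is not the investigators' method. The function name `robbins_monro`, the step-size constant `a`, and the toy problem (estimating a mean from noisy observations) are all assumptions chosen for illustration.

```python
import random


def robbins_monro(noisy_gradient, theta0, n_steps, a=1.0):
    """Classical Robbins-Monro iteration (illustrative, non-adaptive).

    Seeks a root of E[noisy_gradient(theta)] = 0 using
    theta_{n+1} = theta_n - (a / n) * noisy_gradient(theta_n).
    """
    theta = theta0
    for n in range(1, n_steps + 1):
        # Step sizes a/n sum to infinity while their squares sum to a
        # finite value, the usual Robbins-Monro step-size conditions.
        step = a / n
        theta = theta - step * noisy_gradient(theta)
    return theta


# Hypothetical toy problem: solve E[theta - X] = 0, i.e. theta = E[X].
rng = random.Random(0)
samples = [rng.gauss(2.0, 1.0) for _ in range(5000)]
stream = iter(samples)

# Each call consumes one noisy observation X_n and returns theta - X_n.
estimate = robbins_monro(
    lambda theta: theta - next(stream),
    theta0=0.0,
    n_steps=len(samples),
)
sample_mean = sum(samples) / len(samples)
```

With `a=1.0` the iterate reduces to the running sample mean, which makes the toy case easy to check by hand. An adaptive variant of the kind the abstract proposes would presumably tune the step sizes from the data, but the abstract does not specify how.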