ITR: Computational Induction of Scientific Process Models
Institute for the Study of Learning and Expertise, Palo Alto, CA
Investigators
Abstract
This interdisciplinary, interinstitutional research collaboration aims to develop a framework that unifies two separate but central themes in information technology: computational simulation of models that explain important phenomena, and computational induction of knowledge from observed regularities in data. Unlike most previous work in machine learning and data mining, the approach emphasizes methods that generate knowledge in established scientific formalisms, incorporate domain knowledge where possible, focus on causal and explanatory models, address induction from observational time-series data, and are embedded in a simulation environment that scientists can use for model development. The research centers on a new class of models that consist of interacting quantitative processes, and on the problem of inducing such models from time-series data. Computational challenges to be addressed include reducing overfitting and variance, inducing conditions on processes, handling large, heterogeneous data sets with missing values, and scaling to complex models. The resulting algorithms will be incorporated into a trainable simulation environment that lets users construct models manually or induce them from data, then simulate their behavior. Experimental evaluation will use both Earth science observations from the Ross Sea and synthetic data. The trainable simulation environment should let Earth scientists search the space of candidate models systematically, producing more accurate models in much less time. Moreover, the novel computational methods should aid model construction in other fields, such as systems biology and engineering. Both the environment and sample models will be used in courses and made accessible through a Web site. More information is available at http://www.isle.org/process.html.
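To make the idea of a model built from interacting quantitative processes more concrete, the following is a minimal sketch and not the project's actual algorithms. Everything in it is an assumption made for illustration: the variables `P` and `N`, the three candidate processes (`growth`, `loss`, `remineralization`), the forward Euler integration, and the exhaustive search over process subsets and a small rate grid. It simulates a two-process model to produce synthetic time-series data, then searches the candidate space for the structure and rates that best reproduce that data.

```python
from itertools import combinations, product

# Hypothetical processes: each maps (state, rate) to its contribution
# to the time derivatives of the variables it affects.
def growth(state, rate):
    flux = rate * state["P"] * state["N"]
    return {"P": flux, "N": -flux}

def loss(state, rate):
    return {"P": -rate * state["P"]}

def remineralization(state, rate):
    # Distractor process that is absent from the generating model.
    return {"N": rate * state["P"]}

CANDIDATES = {"growth": growth, "loss": loss,
              "remineralization": remineralization}

def simulate(model, initial, steps, dt=0.1):
    """Integrate a model (dict: process name -> rate) with forward Euler."""
    state = dict(initial)
    trajectory = [dict(state)]
    for _ in range(steps):
        deriv = {name: 0.0 for name in state}
        # Sum the contributions of all active processes.
        for pname, rate in model.items():
            for var, value in CANDIDATES[pname](state, rate).items():
                deriv[var] += value
        state = {v: state[v] + dt * deriv[v] for v in state}
        trajectory.append(dict(state))
    return trajectory

def squared_error(predicted, observed):
    return sum((p[v] - o[v]) ** 2
               for p, o in zip(predicted, observed) for v in p)

def induce(observed, initial, rate_grid=(0.1, 0.3, 0.5)):
    """Exhaustively try process subsets and grid rates; keep the best fit.

    Smaller models are tried first, so ties favor simpler structures.
    """
    steps = len(observed) - 1
    best_model, best_error = None, float("inf")
    for size in range(1, len(CANDIDATES) + 1):
        for names in combinations(sorted(CANDIDATES), size):
            for rates in product(rate_grid, repeat=size):
                model = dict(zip(names, rates))
                err = squared_error(simulate(model, initial, steps), observed)
                if err < best_error:
                    best_model, best_error = model, err
    return best_model, best_error

# Synthetic, noise-free data from a known two-process model.
initial = {"P": 0.5, "N": 2.0}
true_model = {"growth": 0.5, "loss": 0.3}
observed = simulate(true_model, initial, steps=50)

model, error = induce(observed, initial)
print(model, error)
```

On this noise-free toy data the search recovers the generating structure and rates, because the true model lies inside the candidate space. Real observational data would bring the challenges the abstract lists, such as overfitting, missing values, and a search space far too large to enumerate.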