Flexible Statistical Modeling

$500,000 | FY2021 | MPS | NSF

Stanford University, Stanford CA

Investigators

Abstract

Statistical learning techniques have made significant progress in the past 15-20 years. Representative areas include neural networks, applied regression, classification, and clustering. As a result of these developments, a powerful collection of adaptive regression and classification techniques is now available and can be applied to a wide range of important science and engineering areas. Typical applications include medical diagnosis, bioinformatics, chemical process control, and face recognition. The focus of this project is on high-dimensional statistics and data science. This work will help scientists working in biotechnology and other areas to interpret and uncover important patterns in large-scale data sets. It will also help scientists and doctors discover the biological bases of many diseases and improve prognosis and treatment selection for patients. The project will provide research training opportunities for graduate students.

This project includes four main thrusts in the area of supervised learning. In the first thrust, the investigator will build feature-efficient, or lean, models that depend on only a small number of unique features. In the second thrust, the investigator will develop COVID-19 case forecasting through customized training and collaboration with a team. In the third thrust, the investigator will develop a model-free approach to the challenge of local feature importance by building a convex region around the point at which a prediction is made. The fourth thrust focuses on improving the large-scale computation of l1-regularized models.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
