CRII: III: Learning Predictive Models with Structured Sparsity: Algorithms and Computations

$132,024FY2020CSENSF

Southern Methodist University, Dallas TX

Investigators

Abstract

Recent advances in predictive modeling technology have greatly reinforced human decision-making processes by extracting hidden trends and gaining valuable knowledge from the past events. Predictive modeling for sparse representation is a fundamental methodology in machine and statistical learning that aims to produce robust outcomes by exploiting domain knowledge, formulating mathematical programs with observed data, and solving the problems with computational resources. For example, if experts believe there are some parts of data that are insignificant, the model should be able to detect and avoid involving such data to increase prediction accuracy. If features of the data possess hierarchical relationships, e.g.,availability of medical measurements depends on which tests were given to the patient, an accurate model must reproduce the structure for practical applications. Achieving such desired conditions through proper modeling is critical to integrate prior understandings of the problem, and adhere to procedural and operational restrictions. This project aims to expand current knowledge of predictive modeling with structured sparsity by introducing a unified framework for many existing problems and providing computational tools through mathematical optimization methodologies. The sparse patterns in the model variables can be formulated exactly by utilizing the discrete property of the indicator function, and approximately by using surrogate functions. The project aims to investigate effectiveness of imposing such conditions by enforcing them as hard constraints. The main objectives include 1) formulating existing problems as constrained optimization problems, which minimizes the model's loss with respect to the provided data while obeying pre-determined sparsity conditions, 2) developing efficient and robust algorithms that are capable of effectively handling resulting nonconvex constraints, and 3) studying numerical performance of the new method compared to the latest sparse modeling technologies used in practice. Based on the prior work, the investigator aims to develop deterministic and randomized algorithms that compute stationary solutions with desirable theoretical properties through iterative procedures. Specific research tasks include implementing the proposed method applied to simulated and real data, and investigating robustness of the model in terms of prediction accuracy, ability to identify significant components of data, and successful reproduction of desired sparse patterns in the repeated experiments. The outcome of this project including data and implemented products will be shared with the public through open-source communities and online repositories. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

View original record on NSF Award Search →