Nonlinear Partial Differential Equations, Monotone Numerical Schemes, and Scaling Limits for Semi-Supervised Learning on Graphs
University Of Minnesota-Twin Cities, Minneapolis MN
Investigators
Abstract
Machine learning aims to harness the power of data to train algorithms to make predictions and perform complex tasks in science, engineering, medicine, and everyday life. Despite the wide success of machine learning, the growing size of typical data sets is creating serious computational challenges, partially due to our incomplete understanding of how algorithms work in the big data regime. The first goal of this project is to address these issues within a branch of machine learning called graph-based semi-supervised learning, which has found applications in problems such as protein sequencing, website classification, and speech recognition. The investigator analyzes some recently-proposed algorithms for semi-supervised learning. A deep mathematical understanding of these algorithms explains why they work and, more importantly, when they will fail. The investigator uses these insights to develop new, fast, and more efficient algorithms for semi-supervised learning. The second goal of this project is to develop and study new algorithms for curvature motions of curves and surfaces that are stable, efficient, and convergent. Curvature motion has found many applications in science and engineering, including materials science, computer vision, image processing, and recently in data science and machine learning. Curvature motion describes, for example, the motion of soap bubbles as well as the evolution of polycrystalline materials (such as metals and ceramic) during the manufacturing process. Because both machine learning and curvature motions have very broad applications across science and engineering, any improvement in algorithms and in our understanding of them could have broad societal impacts. The first objective of this project is to study rigorously a recently-proposed algorithm for graph-based semi-supervised learning based on Lp-Laplacian regularization. In particular, the investigator proves that the learning algorithms have continuum limits that correspond to solving a weighted p-Laplace equation in the viscosity sense. In the limit of small labeled data and infinite unlabeled data, it has recently been conjectured that Lp-Laplacian regularization is ill-posed when p is smaller than the intrinsic dimension of the data. The project aims to settle this conjecture rigorously. The investigator uses these continuum limits to study the relationship between the fraction of labeled data and the performance of the algorithms, and to develop new and more efficient computational algorithms based on these insights. The second objective of the project is to develop and analyze a new class of monotone finite difference schemes for curvature-driven motions of curves and surfaces, for which rigorous convergence proofs are available. The investigator has recently discovered a general technique for constructing monotone finite difference schemes for a wide class of curvature motions of curves and surfaces. He implements the new schemes for a variety of motions, experimentally tests convergence rates, and rigorously proves convergence to the viscosity solution using the Barles-Souganidis framework. Monotonicity of the new schemes allows for the development of fully implicit and unconditionally monotone time-stepping schemes, and guarantees the approximations are capturing the correct continuum dynamics.
View original record on NSF Award Search →