CRII: AF: Enriched Topological Summaries for Inverse Problems
SUNY at Albany, Albany, NY
Abstract
With Big Data comes the need for succinct data summaries that are easy to visualize and understand. Topological Data Analysis (TDA) offers a suite of computational tools for summarizing and visualizing shape in high-dimensional data sets. These tools have been used in recent years to guide new discoveries in cancer research, neuroscience, materials science, image analysis, and many other areas where shape, broadly construed, is of importance. However, summarizing data succinctly means that important differences between data sets can go undetected. This project provides a new mathematical framework for precisely measuring how lossy the most popular methods in TDA are. By developing new ways of quantifying how large-scale differences go undetected, the project offers enrichments of current topological methods to obtain novel data science tools with greater distinguishing power. The awarded funds will go primarily toward supporting a graduate student to aid the PI in basic research and development of these enriched topological summaries. Computational experiments on data sets of public interest, e.g., time-varying socio-economic indicators and weather data, will be carried out to test the performance of these tools against current state-of-the-art methods. Additionally, the PI will incorporate these novel methods into the data science curriculum at the PI's host institution and recruit students from historically under-represented groups to be trained as the nation's next generation of data scientists.
This project is an ambitious extension of earlier work undertaken by the PI and provides a targeted attack on the inverse problem for the main objects of study in topological data analysis: the merge tree, which tracks how the connected components of the sub-level sets of a function evolve; the Reeb graph, which tracks the connected components of the fibers of a function; and the barcode/persistence diagram, which is a collection of intervals/points in the plane that represent an algorithmic pairing of the critical points of a function on a space. The PI showed in earlier work how a merge tree determines the associated barcode and provided a precise enumeration of how many distinct merge trees have the same barcode. By identifying functions on the real line that are related by an orientation-preserving coordinate transformation, the PI showed how a novel enrichment of the merge tree, the chiral merge tree, faithfully captures these equivalence classes of functions and offers exponentially greater distinguishing power than the barcode for time-series analysis. This project aims to carry out a similar analysis for functions on surfaces, with an eye toward improving the classification performance of persistent homology in image analysis, and for Reeb graphs, to better integrate with level-set persistent homology. By carrying out a careful study of inverse problems in each of these settings, novel enrichments analogous to the chiral merge tree will be developed. Additionally, to make these enriched topological summaries useful for data classification tasks, novel metrics will be defined for comparing each of these summaries, and algorithms will be developed for the efficient computation of these metrics. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
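To make the lossiness of the barcode concrete, the following is a minimal sketch (not the PI's implementation) of the sub-level set pairing for a function sampled on an interval. It assumes a plain union-find with the standard "elder rule", under which the component with the higher minimum dies at a merge. The function name `sublevel_barcode` and the two sample series `f` and `g` are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def sublevel_barcode(values: List[float]) -> List[Tuple[float, float]]:
    """0-dimensional sub-level set barcode of a function sampled on a path.

    Illustrative sketch: samples are added in increasing order of value and
    components are merged with union-find. At each merge the component with
    the higher minimum dies (elder rule).
    """
    n = len(values)
    parent = list(range(n))
    birth = list(values)      # birth[root] = minimum value in that component
    added = [False] * n
    bars: List[Tuple[float, float]] = []

    def find(i: int) -> int:
        # Find the component root, compressing the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in sorted(range(n), key=lambda k: values[k]):
        added[i] = True
        for j in (i - 1, i + 1):
            if 0 <= j < n and added[j]:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                old, young = (ri, rj) if birth[ri] <= birth[rj] else (rj, ri)
                # Record a bar only if the dying component lived a nonzero time.
                if birth[young] < values[i]:
                    bars.append((birth[young], values[i]))
                parent[young] = old

    # The component of the global minimum never dies.
    bars.append((min(values), float("inf")))
    return sorted(bars)


# Two hypothetical time series whose merge trees differ:
# in f, the minimum at height 2 first merges with the minimum at height 1;
# in g, it first merges with the global minimum at height 0.
f = [0, 5, 1, 4, 2, 6]
g = [2, 4, 0, 5, 1, 6]

print(sublevel_barcode(f))  # [(0, inf), (1, 5), (2, 4)]
print(sublevel_barcode(g))  # [(0, inf), (1, 5), (2, 4)]
```

Both series produce the same barcode even though their components merge in a different order, which is the kind of undetected difference that an enriched summary is meant to capture.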