EAGER: The Exploration of Geometric and Non-Geometric Structure in Data

$150,000 | FY2015 | CSE | NSF

The Ohio State University, Columbus, OH

Investigators

Abstract

The goal of machine learning is to extract useful information from data. While the amount of data available to researchers for analysis is ever increasing, much of the data are unlabeled, meaning that they come without labels indicating their associations with specific learning tasks. Thus, understanding unsupervised inference is one of the key problems in machine learning. In addition, data annotated for a certain task may be difficult to use even for tasks that are only slightly different. This is known in the literature as the problem of transfer learning. To make the most of the available information, machine learning algorithms need to obtain, analyze, and use realistic structural assumptions about the data based on rigorous mathematical models. The proposed work offers students working on this project an opportunity to be exposed to a broad spectrum of topics including machine learning, statistics, geometry, and applied mathematics. Students will learn a combination of theory and algorithm development skills in machine learning and data analysis. The results of this work will be disseminated to the broad scientific community through publications in journals and conferences and through presentations in various venues, including tutorials and course notes. The material related to this project will be incorporated in the PI's and co-PI's courses. The PIs will also create summer research and practice opportunities for interested undergraduate students in research related to the project.

This EAGER project begins an exploration of two types of structural assumptions on the data. First, geometric structures in data will be explored, such as the hierarchical structure of clusters and density. Second, the use of partial orders for non-geometric data will be explored, based on probabilistic models for partial rankings and orders, for problems such as zero-shot learning and transfer learning. By approaching the problem of inference from data within these frameworks, the output of this project will be a stepping stone toward addressing the challenges of machine learning and toward developing efficient algorithms that advance the state of the art in both theory and practice. It is argued that these models and the proposed mathematical and algorithmic machinery are amenable to theoretical analysis and will provide insight into properties of real data. Results from the proposed work will broaden the scope of machine learning methods to analyze more complex data in a theoretically well-founded manner.
