CAREER: Scalable Spatial Data Science on User-generated Data
University of California-Riverside, Riverside, CA
Investigators
Abstract
Hundreds of millions of people use the World Wide Web every day. These users generate tremendous amounts of data that relate to all aspects of life and contain a wealth of information about local societies, human activities, and social behavior. Such human-generated data naturally have a spatial aspect, since geographic location is inherent in many human activities. This makes the data a rich and up-to-date source for social scientists who study different aspects of modern societies in order to improve people’s lives. However, the sheer volume of such data makes it computationally challenging to run complex analyses and extract meaningful insights at a large scale. This project develops novel scalable data management techniques, especially query processing techniques, to support spatial data science on large user-generated datasets. The project supports two families of queries that are widely used for spatial analysis of user-generated data: spatial estimation queries and spatial grouping queries. The proposed research on spatial estimation includes learning-assisted modules that incorporate machine learning models to improve spatial scalability and accuracy. The proposed research on spatial grouping scales up the grouping of various spatial data types, including points, lines, and polygons, to provide building blocks that support a range of applications. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.