
Collaborative Research: Geoinformatics: Facility: Paleobiology Database: Preserving and Presenting Ancient Data for Future Research

$428,774 | FY2020 | GEO | NSF

University Of Wisconsin-Madison, Madison WI

Abstract

The Paleobiology Database (PBDB) is an open-access, community-curated resource containing information on the temporal and geographic distribution of almost 1.5 million fossil occurrences, their taxonomic classification, and associated bibliographic information. The PBDB has become the go-to data resource for answering questions such as “what fossils are found in my backyard?”, “how has biodiversity changed over time?”, and “what are the drivers and consequences of mass extinctions?” The PBDB has cyberinfrastructure that allows data to be accessed and used within educational, commercial, and research web, mobile, and analytical applications. This project will develop and deploy new state-of-the-art machine reading and learning tools to automate key steps in the process of finding and extracting PBDB-relevant data from published scientific papers and reports. The primary objective is to decrease the high cost of expert time and effort currently required to find and enter new data into the system. The automated reading and learning tools will primarily focus on paleontological and taxonomic literature. However, this new machine reading and artificial intelligence system, which is capable of locating and extracting specialized information and automating key steps in database construction, will address a general cyberinfrastructure challenge that has far-reaching applications across science and industry. This project will facilitate research that characterizes natural geological resources, improves stratigraphic correlation and paleogeographic reconstructions, calibrates molecular clocks, and facilitates paleoecological assessments of the effects of modern and ancient environmental change. The project will establish a first-of-its-kind, AI-driven pipeline that connects a rapidly growing digital library and computing infrastructure, called GeoDeepDive, to the large expert community of PBDB users.
This pipeline will serve as a template for deploying similar pipelines for other user communities. As part of this process, the PIs will complete development and deployment of the PBDB data acquisition application program interface (API), which will modernize the PBDB data entry workflow and enable the development of third-party apps designed for entering and editing specific PBDB data types, such as taxonomic opinion and classification data. This will likely lower the barrier to participation in PBDB data entry and editing, and thus enhance a crowd-sourced model for future PBDB growth and improvement. This project is jointly funded by the Geoinformatics and the Sedimentary Geology and Paleobiology programs in the Division of Earth Sciences. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
