
EarthCube Data Capabilities: Expanding the Ocean Protein Portal Capabilities for Use in Biochemical Research and Education

$599,989 | FY2020 | GEO | NSF

Woods Hole Oceanographic Institution, Woods Hole MA


Abstract

This project aims to expand the functionality of the Ocean Protein Portal (OPP) for use in research and education on ocean biochemistry. The OPP prototype was designed to let a broad range of scientists and students answer three questions: 1) “Where is my protein of interest in the oceans?”, 2) “Who makes the protein?” (through least common ancestor analysis), and 3) “How much is there?”. Making ocean protein datasets accessible and searchable to broad multi-domain communities, including biological and chemical oceanographers, geobiologists, microbiologists, biochemists, and bioinorganic chemists, will improve our understanding of the oceans and of microbial biochemistry. Moreover, these large datasets have the potential to provide future scientists with an important record of environmental change, so the portal serves to capture, organize, and share these data to enable long-term studies of ocean change. The project also contributes to community building within the ocean metaproteome community by providing a data repository and motivating ongoing efforts to improve data quality and standards. Educational use will be developed in collaboration with teachers and professors through the creation of educational modules for students learning about chemical reactions. Technical goals include specific improvements to the Ocean Protein Portal: expanding search capabilities, adding the ability to serve new metaproteomic data types, implementing a Knowledge Graph system for increased interoperability and sustainability, creating an API for machine-accessible searches, furthering connections to external data, enhancing visualization capabilities, creating reproducible and citable search results, and developing automated ingestion with immediate QC capability.
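The “Who makes the protein?” question relies on least common ancestor analysis. As a rough illustration only, and not the portal's actual implementation, the idea can be sketched as finding the deepest taxon shared by every organism whose proteins match a peptide. The function name, the root-to-leaf list representation, and the simplified lineages below are all assumptions made for this example.

```python
from typing import List, Optional, Sequence


def least_common_ancestor(lineages: Sequence[Sequence[str]]) -> Optional[str]:
    """Return the deepest taxon shared by every root-to-leaf lineage.

    Returns None when there are no lineages or they share no taxon at all.
    """
    if not lineages:
        return None
    shared: List[str] = []
    # zip stops at the shortest lineage, so we never read past its end.
    for ranks in zip(*lineages):
        if all(taxon == ranks[0] for taxon in ranks):
            shared.append(ranks[0])
        else:
            break  # lineages diverge here; deeper ranks cannot be shared
    return shared[-1] if shared else None


# Hypothetical, simplified lineages for two organisms matching one peptide.
peptide_matches = [
    ["Bacteria", "Cyanobacteria", "Synechococcales", "Prochlorococcus"],
    ["Bacteria", "Cyanobacteria", "Synechococcales", "Synechococcus"],
]
print(least_common_ancestor(peptide_matches))  # Synechococcales
```

A peptide found in only one organism resolves to that organism itself, while one shared across distant groups resolves to a shallow rank, which is why the result indicates how specifically a protein can be attributed.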
The transition of the OPP to a Knowledge Graph will further facilitate connections with domains outside marine ecology, as the graph will link out to other data resources in the environmental sciences and biology. An OPP Knowledge Graph would also provide a data structure useful to computer science: it is immediately available to Machine Learning and Artificial Intelligence applications, advancing the potential for novel machine-assisted scientific discovery. The researchers will conduct tutorials (virtual or in person) in which participants learn about the ocean metaproteomics data type, how to use the OPP interface, and how to pull data from the OPP and plot it within the Jupyter notebook environment (using Python, Matplotlib, Bokeh, and Binder). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
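To make the Knowledge Graph idea in this abstract concrete, a graph of this kind can be thought of as a set of subject-predicate-object statements that are followed from one record to the next. The sketch below is a minimal in-memory illustration under that assumption. The class, the predicate names (`observedIn`, `collectedAt`, `sameAs`), and every identifier are hypothetical and do not come from the OPP.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple


class TripleStore:
    """A tiny in-memory store of (subject, predicate, object) statements."""

    def __init__(self) -> None:
        self._by_subject: DefaultDict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._by_subject[subject].add((predicate, obj))

    def objects(self, subject: str, predicate: str) -> List[str]:
        """Return, in sorted order, everything linked from subject by predicate."""
        # .get avoids creating empty entries for unknown subjects.
        edges = self._by_subject.get(subject, set())
        return sorted(o for p, o in edges if p == predicate)


def stations_for(graph: TripleStore, protein: str) -> List[str]:
    """Follow protein -> sample -> station links (a two-hop traversal)."""
    stations: Set[str] = set()
    for sample in graph.objects(protein, "observedIn"):
        stations.update(graph.objects(sample, "collectedAt"))
    return sorted(stations)


# Hypothetical identifiers, invented purely for illustration.
graph = TripleStore()
graph.add("protein:example-1", "observedIn", "sample:example-A")
graph.add("sample:example-A", "collectedAt", "station:example-7")
graph.add("protein:example-1", "sameAs", "external:example-record")

print(stations_for(graph, "protein:example-1"))  # ['station:example-7']
```

The `sameAs` statement shows how a record could point to an outside resource, which is the kind of link-out to other data resources that the abstract describes.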
