Proto-OKN Theme 1: BioBricks-OKG: An Open Knowledge Graph for Cheminformatics and Chemical Safety
Insilica, LLC, Bethesda, MD
Investigators
Abstract
This NSF Proto-Open Knowledge Network (Proto-OKN) Theme 1 project aims to enhance the open-source BioBricks system, turning it into a knowledge graph backend that eases access to chemical health and safety data and enables the use of AI in developing new approaches to chemical testing and regulation. Currently, the data ecosystem spanning health informatics, toxicology, and cheminformatics comprises numerous disjointed databases with inconsistent availability, which makes it challenging to access, process, and integrate data from multiple sources for robust model construction. The BioBricks Open Knowledge Graph (BioBricks-OKG) is designed to address this problem by semi-automating the harmonization of tabular data into a unified knowledge graph, using ontology alignment methods to improve data sharing between databases. BioBricks-OKG aims to scale these techniques to over 60 public health and cheminformatics databases. It is expected to benefit the public health, medical, and life science fields by fostering the integration of data science and machine learning. Once it is operational, clinics, pharmaceutical companies, regulatory agencies, and bioinformatics Contract Research Organizations (CROs) can use BioBricks-OKG for intelligent data queries, without having to navigate complex data repositories or build redundant data extraction pipelines.
The project leverages the BioBricks-AI framework and the project team's experience in building toxicology-focused knowledge graphs. The BioBricks-AI framework offers open-source repositories that transform health informatics databases into a distributable, serialized format for scalable data analysis. The work involves consolidating BioBricks-AI repositories and public health databases into a large graph database. This database will link chemicals; their genetic, molecular, and cellular disruptions; health hazards; adverse outcome pathways; testing methods; and other entities relevant to chemical safety and regulation. The National Toxicology Program's Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM) will use the resulting BioBricks-OKG to design, gather information on, and evaluate "New Approach Methodology" (NAM) test guidelines.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
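As a rough illustration of what harmonizing tabular data into a graph can look like, the sketch below converts rows from two small tables into subject-predicate-object triples and groups them by chemical. It is not the BioBricks-OKG implementation: the tables, column names, identifiers (such as `CHEM:0001`), and predicates (`has_hazard`, `tested_by`) are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical rows standing in for two source tables with different
# column names for the same kind of entity (a chemical identifier).
hazard_rows = [
    {"chemical_id": "CHEM:0001", "hazard": "skin sensitization"},
    {"chemical_id": "CHEM:0002", "hazard": "hepatotoxicity"},
]
assay_rows = [
    {"chem": "CHEM:0001", "assay": "ASSAY:A"},
]


def rows_to_triples(rows, subject_col, object_col, predicate):
    """Map each table row to one (subject, predicate, object) triple."""
    return [(row[subject_col], predicate, row[object_col]) for row in rows]


def build_graph(triples):
    """Group triples by subject so each node lists its outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


# The column-to-predicate mapping is the manual part of this sketch; it stands
# in for the alignment step that decides which columns denote the same entity.
triples = (
    rows_to_triples(hazard_rows, "chemical_id", "hazard", "has_hazard")
    + rows_to_triples(assay_rows, "chem", "assay", "tested_by")
)
graph = build_graph(triples)
```

Once both tables share one identifier space, a single lookup such as `graph["CHEM:0001"]` returns the hazard and the assay for that chemical together, which is the kind of cross-database query the abstract describes.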