MS Expand Data Commons and Data Sharing
Leidos Biomedical Research, Inc., Frederick MD
Abstract
The Beau Biden Cancer Moonshot Blue Ribbon Panel Enhanced Data Sharing Working Group recommended the creation of a data science infrastructure to connect repositories, analytical tools, and knowledge bases and to allow data to be aggregated, queried, analyzed, and visualized in unique and powerful ways within and across data types. In line with this vision, the NCI has created two components, the Genomic Data Commons (GDC) and the Cancer Genomics Cloud Pilots, that define some of the core elements and capabilities necessary for realizing an NCI Cancer Research Data Commons, which is itself an important component of a National Cancer Data Ecosystem. Building on the experience gained in developing the GDC and the Cloud Pilots, and on an evaluation of the gaps in those initiatives relative to an NCI Cancer Research Data Commons, NCI is defining a reusable, expandable framework for the Data Commons. This Commons Framework will include and support (1) secure user authentication and authorization; (2) data submission, including validation against data models and vocabularies; (3) domain-specific, extensible data models; (4) a mechanism for data and tool discoverability; (5) an API and container environment for tools and pipelines; and (6) user workspaces for storing data, tools, and results.