GrantIndex
The next iteration of the AMP-T2D Knowledge Portal

$224,184 · UM1 · FY2023 · DK · NIH

Broad Institute, Inc., Cambridge MA

Abstract

Project Summary: In this supplement, we propose to make the LD-score regression (LDSC) software package, a tool widely used by the human genetics community and cited thousands of times, more compliant with the goals of open science. In our original project, UM1DK105554 (“The next iteration of the AMP-T2D Knowledge Portal”), we proposed to continue development of the Common Metabolic Disorders Knowledge Portal (CMDKP) by enhancing its infrastructure to aggregate, analyze, and visualize genetic and genomic datasets and results. As part of this work, we have added LDSC to the CMDKP. LDSC calculates statistics between pairs of GWAS datasets or between a GWAS dataset and a genomic annotation, which we use to help users query and interpret data in the CMDKP. In the process of adding LDSC to the CMDKP, we made several modifications to its codebase to reduce computational overhead.

This work in turn identified three additional improvements that could be made to the LDSC software package to make it more compliant with the goals of open science. First, by implementing batch dataset processing natively within the LDSC codebase, we would make it possible to easily scale LDSC on the cloud and enable users to apply it to their own GWAS data across many traits or annotations. Second, by implementing a software service and API to which users could upload their own GWAS data, and then obtain cross-trait correlations or annotation enrichments across all datasets in the CMDKP, we would provide a new dissemination channel for LDSC targeted at users who do not want to download any data or code. Third, by refactoring LDSC into a more modular set of libraries within a modern language, we would make it easier to maintain, extend, and reuse. Collectively, these improvements will advance open science by making a widely used software tool (a) parallelizable, cloud-ready, refactored, and more extensible; and (b) more reusable through an easy-to-use new software service and API. They will also advance the goals of our original project by providing a new mechanism to interact with data in the CMDKP and contribute new datasets to it.
