Data Science and Sharing Team

$1,392,040 | ZIC | FY2023 | MH | NIH

National Institute of Mental Health


Abstract

Data Sharing

In January 2023, the NIH implemented a new data sharing policy to promote scientific data sharing. As the Data Science and Sharing Team (DSST), we are well-positioned to advise and assist the NIMH IRP in preparing data management and sharing plans and in organizing datasets for sharing in public repositories.

In collaboration with Francis McMahon's group, the DSST has curated and uploaded whole exome sequencing and phenotypic data from 156 subjects with major depressive disorder as well as healthy controls (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs003329.v1.p1). This dataset is available on the NIH's genetic data repository, dbGaP.

Continuing our collaboration with Audrey Thurm's group, the DSST extended the data available for the Autism Subtypes Study to include data from 32 additional MRI sessions. The publicly available data can be found on the NDA (https://nda.nih.gov/study.html?id=1887).

In collaboration with the Section on Neuroadaptation and Protein Metabolism, the DSST expanded the previously released dataset of cerebral protein synthesis (rCPS) into four separate datasets that include raw MRI and PET images, along with blood sampling data. These datasets also include preprocessed structural MRI and PET derivatives.

The DSST also worked with Ellen Leibenluft's group to curate and share structural MRI (T1w and diffusion) and phenotypic data from 132 youth subjects with attention-deficit/hyperactivity disorder or disruptive mood dysregulation disorder. This dataset has been uploaded to the OpenNeuro repository and will be made public upon publication of the accompanying paper.

Data Curation

The DSST continues to provide the IRP with access to multiple large, publicly available datasets. We maintain a comprehensive list of these datasets on our website (http://cmn.nimh.nih.gov/dsst). To date, we maintain over 120,000 MRI scan sessions across 31 different datasets.
This year, our most requested datasets were the UK Biobank, the Human Connectome Project, the Philadelphia Neurodevelopmental Cohort, and the Child Mind Institute's Healthy Brain Network. One complicating aspect of working with multiple large datasets is that each dataset formats phenotypic data in idiosyncratic ways. To address this, the DSST, in collaboration with members of the Machine Learning Team, the Section on Functional Imaging Methods, and external collaborators, has developed open-source software that harmonizes the phenotypic data formats of our most requested datasets to accelerate multi-dataset studies. The DSST's Eric Earl presented this software as a poster at the Organization for Human Brain Mapping annual conference (https://osf.io/vn4yq/), and the software is publicly available on GitHub (https://github.com/nimh-dsst/dataset-phenotypes).

Training

The DSST continually provides ad hoc training while consulting with researchers and trainees throughout the NIH intramural program. This fiscal year, it also offered researchers seven structured training opportunities, detailed below:

1. The DSST continues to host its weekly Lunch & Learn series, in which speakers present on a topic relevant to the DSST's mission. This year, topics included data visualization, software design, and GitHub Actions, among others.

3. In December 2022, Dustin Moraczewski and Arshitha Basavaraj organized and hosted the DC chapter of the annual BrainHack Global hackathon on the NIH campus.

4. In August 2022 and February 2023, Adam Thomas and Eric Earl organized two additional BIDS Validator Coding Sprints, which were attended by scientists and developers from around the world. The prototype from the previous sprint was merged into the main code branch and the schema was updated. It is now possible to use the schema-based validator for datasets uploaded to OpenNeuro, and this will soon become the default. The report for the third and final sprint is available at https://bit.ly/2023BidsSprint.

5. Arshitha Basavaraj and Eric Earl designed a workflow for anonymizing structural MRIs, which is available on GitHub (https://github.com/nimh-dsst/dsst-defacing-pipeline). The DSST provided training to multiple groups throughout the year on the implementation of this pipeline.

6. The DSST has also created documentation that provides tutorials and explains common hurdles encountered when working with shared data (https://dsst.readthedocs.io/en/latest/).

7. Last year, the DSST curated and uploaded the NIMH's Healthy Volunteers Study to OpenNeuro (https://openneuro.org/datasets/ds004215), where it has been downloaded 290 times. As a result of this curation, Arshitha Basavaraj presented a poster at the annual Organization for Human Brain Mapping meeting to educate researchers on lessons learned and best practices in curating a dataset for sharing. The poster can be found on the Open Science Framework website (https://osf.io/yk7gw/).

Collaborations

In our collaboration with Armin Raznahan and the Section on Developmental Neurogenomics, we continue to assist with processing structural brain images of over 40,000 subjects within the UK Biobank to examine complex relationships between genome and brain structure. A paper focused on sex differences in the relationship between genome and brain structure was recently submitted for review, with a preprint on medRxiv (https://doi.org/10.1101/2023.08.09.23293881).

Our BRAIN grant-funded collaboration with Drs. Robert Innis, Gitte Knudsen, Melanie Ganz, and Cyril Pernet to create the OpenNeuroPET archive continues to be productive. The repository is now live, and we are focused on improving the specification for BIDS derivatives from both PET and other modalities. In June, Adam Thomas attended a workshop in Copenhagen focused on finalizing the BIDS Derivative extension. A guidelines document from that workshop is available here: https://bit.ly/BIDS_derivWrksp2023.

In collaboration with the Machine Learning Team, we are exploring how latent variables derived from harmonized phenotypic data can improve the prediction of whole-brain functional connectivity. In addition, a related project examines the interpretability of fMRI decoding using neural networks. The resulting paper is now published in Aperture Neuro (https://doi.org/10.52294/001c.85074).

The DSST's Dustin Moraczewski collaborated with Gang Chen and Paul Taylor of the Statistical and Scientific Computing Core (SSCC) to use single-trial responses from twins in the ABCD dataset to examine statistical models of heritability. This paper has been submitted for review, and a preprint can be found on bioRxiv (https://www.biorxiv.org/content/10.1101/2023.06.24.546389v1).

In another project with the SSCC and Dr. Jo Etzel at Washington University, Arshitha Basavaraj and Dustin Moraczewski contributed to the editorial that accompanies a Frontiers in Human Neuroscience special issue on quality control in fMRI (https://doi.org/10.3389/fnins.2023.1205928).

In a collaboration with Dr. Mark Histed and the Unit on Neural Computation and Behavior, we applied for and received a one-year Kavli seed grant to write an extension to the Neurodata Without Borders (NWB) standard to allow for the storage of holographic photostimulation patterns. This work is now complete, and the extension will soon be submitted to the NWB module catalog at https://nwb-extensions.github.io/.

COVID-19

In collaboration with Francisco Pereira and the Machine Learning Team and Joyce Chung in the Clinical Director's Office, Carl Harris completed the exploratory portion of their study of predictors of mental well-being during the COVID-19 pandemic. This work was preregistered on OSF (https://osf.io/atxcg) and is now under review for publication.
