LICI Microbiome and Genetics Core

$2,106,010 | ZIC | FY2023 | CA | NIH

Division of Basic Sciences - NCI

Abstract

The Microbiome and Genetics Core (MGC) of the Laboratory of Integrative Cancer Immunology (LICI) runs its microbiome facility in Building 37 in Bethesda with a team currently consisting of two research technicians, two bioinformaticians, two scientists and one postbac student. Its primary function is to meet the growing interest in, and the challenges of, characterizing the role of the microbiota in cancer and inflammatory processes as well as in general health. Having established reliable and reproducible methods to isolate and characterize nucleic acids of microbiota isolated from fecal sources, the core has worked with a range of source materials and PIs to help determine changes in microbial representation between experimental samples. The core now offers four distinct experimental characterizations: ribosomal amplicon sequencing (16S for bacteria or ITS for fungi), whole genome sequencing of microbial isolates, shotgun metagenomics and shotgun transcriptomics. DNA has been extracted from source organisms such as human, mouse, macaque and Drosophila and from source materials as varied as fecal pellets, anal and vaginal swabs, intestinal tissue and saliva. The expansion of services beyond amplicon sequencing enables the core to look at potential metabolic pathway changes induced by changes in gene content and composition of the microbiota. Robotic sample preparation platforms (Eppendorf 5073 and 5075) are used to maximize throughput and reproducibility, both for nucleic acid isolation and for barcoded library preparation. Quantification is accomplished using qPCR or spectroscopy. Following purification, barcoding and quantification, an Illumina MiSeq is used to sequence amplicons of 16S rRNA genes. For genomic approaches, the same DNA isolation process is used and as little as 1 ng of DNA is subjected to fragmentation and library preparation by transposon-driven 'tagmentation'.
Whole genome sequencing of isolates is done on the Illumina MiSeq platform, and shotgun metagenomes of the microbiota are run on the higher-capacity NextSeq in the core or on HiSeq and NovaSeq platforms elsewhere. In the past year the core has played a central part in identifying the role of the microbiota in determining outcomes of immune checkpoint inhibitor therapy in melanoma. Samples from more than 30 projects have been processed, from inside LICI and NCI as well as for collaborators from other NIH institutes, and more than 1 Tb of sequence data has been generated and analyzed from these platforms. Across the projects, challenges have been met successfully, ranging from how to isolate DNA from sources of high or low bacterial biomass, to how to partition analyses from different sources, to which treatments maximize the signal-to-noise ratio of experiments. We handle samples associated with both clinical and basic scientific research. We continue to use both Illumina's cloud server and a backup system at the computer center of FNLCR to meet the challenges of storage, delivery and backup of large amounts of information. We continue to make available two analytical approaches to determining microbial abundances for 16S amplicons, the Qiime2 and mothur platforms, and have tested them extensively. Our favored pipeline takes advantage of components of each, but primarily Qiime2 is used. The analyses are also limited by the quality of databases of ribosomal RNA. We continue to develop a database of fully vetted, high-quality rRNA sequences for use in identifying components of the microbiome in samples. Standard outputs generated by MGC bioinformatics show taxonomic representations for all samples in case-control studies (alpha and beta diversities), UniFrac distances estimated between samples, and sample differences illustrated using principal component analyses (PCA), as well as statistical evaluation of differences.
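As an illustration of the alpha and beta diversity outputs mentioned above, the following minimal Python sketch computes a Shannon index for one sample and a Bray-Curtis dissimilarity between two samples. It is not the MGC pipeline and does not call Qiime2 or mothur; the function names and the example count vectors are hypothetical, and Bray-Curtis is used here only as a simple, non-phylogenetic stand-in for a between-sample distance (UniFrac additionally requires a phylogenetic tree).

```python
import math


def shannon_index(counts):
    """Shannon alpha diversity (natural log) of one sample's taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    # Taxa with zero counts contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples over the same taxa."""
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list counts for the same taxa")
    denominator = sum(sample_a) + sum(sample_b)
    if denominator == 0:
        raise ValueError("both samples are empty")
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for four taxa in two samples.
    case = [120, 30, 0, 50]
    control = [40, 40, 60, 60]
    print("Shannon (case):", shannon_index(case))
    print("Shannon (control):", shannon_index(control))
    print("Bray-Curtis:", bray_curtis(case, control))
```

In practice the samples would usually be rarefied or otherwise normalized to a common depth before such values are compared.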
Assembly, analysis and annotation for shotgun metagenomics are far more complex than amplicon-based characterizations. For these more challenging procedures, specialized bioinformatic pipelines designed to interrogate the complex data are used, particularly our in-house developed JAMS package (but also MetaPhlAn4). JAMS works from assembled contigs to build up genomes de novo, whereas MetaPhlAn uses homology mapping of reads against a database of known genomes. Standard output for JAMS includes taxonomic representations (starting with the proportion of host DNA), rarefaction analyses, breakdowns of metadata into all defined categories, analysis by PCA, genomic annotation of assembled contigs, gene and gene family annotation (using Prokka) and biochemical pathway analyses. JAMS output can be seamlessly passed to our analytical network software (TkNA) to identify potential causal relationships among microbes and host genes and pathways. The work of the MGC has been recognized by co-authorships with collaborators in 7 papers over the past year. Furthermore, the descriptions and code for the in-house bioinformatic pipelines JAMS and transkingdom network analysis (TkNA) have been published. Continuing work includes characterization of shotgun metagenomes in checkpoint inhibition therapy of cancer; therapeutic responses to fecal transplantation; microbiota in human esophageal biopsy; microbiota and bile acid metabolism in liver cancer; microbiome characterization in hematopoiesis reconstitution; microbiota from human oral samples related to outcomes from transplantation; characterization of salivary microbiota in hepatitis infection; and shotgun metagenomic analysis of human tooth microbiota in two distinct monogenic neutrophil deficiency syndromes. Additional projects include microbiome characterization in association with neurological defects in Drosophila.
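The rarefaction analyses listed among the standard outputs above can be illustrated with a small, generic Python sketch: reads are subsampled without replacement to a fixed depth and the number of observed taxa is recorded at each depth. This is only an illustration of the general technique under stated assumptions, not the JAMS implementation; the function names, the seed handling and the example counts are all hypothetical.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a taxon -> count map."""
    # Expand the count table into one entry per read.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed for reproducible subsampling
    subsampled = {}
    for taxon in rng.sample(pool, depth):
        subsampled[taxon] = subsampled.get(taxon, 0) + 1
    return subsampled


def rarefaction_curve(counts, depths, seed=0):
    """Observed richness (number of taxa seen) at each subsampling depth."""
    return [(d, len(rarefy(counts, d, seed))) for d in depths]


if __name__ == "__main__":
    # Hypothetical read counts per taxon for one sample.
    sample = {"Bacteroides": 500, "Faecalibacterium": 300,
              "Akkermansia": 40, "Escherichia": 5}
    for depth, richness in rarefaction_curve(sample, [10, 100, 500, 845]):
        print(depth, richness)
```

A curve that flattens as depth increases suggests that sequencing depth was sufficient to capture most taxa present in the sample.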
Large studies the MGC has undertaken in the past years (a mother-infant nutrition cohort as well as a melanoma cohort) indicate that the experimental and analytical throughput of the MGC is sufficient to handle data from cohorts in excess of 1000 individuals. We also continue to support analysis of the genetics of HLA expression and the role this plays in cancer, autoimmunity and infectious disease outcomes. We have been involved in the production of papers showing the extent of tapasin dependency in HLA expression as well as the distribution of constant regions of IgG genes. These studies build upon our work in helping to show that HLA expression affects outcomes in infectious disease such as HIV infection, as well as in autoimmune-related conditions such as Crohn disease or even transplant rejection. Ongoing investigations concern the role of neoantigen recognition by HLA alleles in immunotherapy outcomes as well as characterization of variegated KIR expression in natural killer cells using single-cell RNA sequencing data. The genetic elements that control immune gene expression are of considerable interest, and we continue to support groups working on their characterization. We have developed a novel peptide-based metric to predict the extent of functional complementarity of HLA alleles in heterozygotes. Testing has been done in HIV cohorts, but the methodology has applications in other infectious diseases and can potentially be modified to associate with outcomes in cancer. The work is currently under review for publication.