Exploration of cloud computing for CAZyme research

$219,316 | R01 | FY2023 | GM | NIH

University of Nebraska-Lincoln, Lincoln, NE

Abstract

PROJECT SUMMARY. Title: Exploration of cloud computing for CAZyme research. Our R01 parent project (R01GM140370) intends to develop four bioinformatics tools for automated annotation of CAZymes (Carbohydrate Active Enzymes) and CAZyme Gene Clusters (CGCs) in the human gut microbiome. These automated tools will enhance: (i) the basic biomedical science of characterizing new polysaccharide (or glycan) metabolic enzymes and polysaccharide utilization loci (PULs, gene clusters with known carbohydrate substrates) in the human gut microbiome, and (ii) the emerging practice of personalized nutrition (e.g., using gut microbiome sequencing to infer whether a person is a responder to certain dietary glycans or prebiotics). In the past two years, we have developed dbCAN3 (Aim 1 of R01) and dbCAN-seq (Aim 3 of R01), one web server and one online database, to allow users to submit genomic data from any microbiome for automated CAZyme, CGC, and glycan substrate annotation. Both websites are now hosted on our lab’s standalone desktop server (a six-year-old computer with a 16-core/32-thread CPU), which is neither a secure nor a sustainable solution and cannot meet the increasing demand from users who routinely submit jobs to our servers. For example, our popular dbCAN2 web server (the second version of dbCAN that started in 2012) processed over 35,000 user-submitted jobs in 2022, all on this desktop computer. Therefore, there are challenges and risks that may disrupt the popular service we provide to tens of thousands of microbiome users all over the world, and additional support is requested to explore moving dbCAN3 to a cloud computing platform, e.g., Amazon Web Services (AWS).
These challenges include: (i) our local desktop server is no longer able to keep up with the continuously growing number of job submissions, (ii) the server, built in 2017, is already out of warranty, and (iii) the server has frequently been reported by the University IT service department to have numerous security vulnerabilities due to its old operating system and software. Therefore, the major goal of this supplement grant proposal is to explore and test the application of Amazon Web Services (AWS) to support the CAZyme bioinformatics tool development objective of our R01 parent project. In this one-year project, we aim to test the deployment of our dbCAN3 website on AWS by taking advantage of the AWS web hosting service and AWS Batch for automatic workload distribution, and to compare it with two on-premises solutions in terms of computational efficiency. To achieve these goals, we have assembled a multi-disciplinary research team including three faculty members, one computer specialist, and one graduate student. We have all the necessary expertise in bioinformatics, cloud computing, and high-performance computing. The successful completion of this cloud exploration project will significantly increase our knowledge of using AWS for our R01 CAZyme bioinformatics tool development.

View original record on NIH RePORTER →