GrantIndex

EAGER: Integrating Multi-Omics Biological Networks and Ontologies for lncRNA Function Annotation using Deep Learning

$300,000 | FY2023 | CSE | NSF

University of Texas at Arlington, Arlington, TX

Investigators

Abstract

Long non-coding RNAs (lncRNAs) are a class of ribonucleic acid (RNA) molecules longer than 200 nucleotides that do not encode proteins but play important regulatory roles in biological and cellular processes such as cancer metastasis, immune responses, chromatin remodeling, and embryonic development, and that can serve as therapeutic targets. Despite exciting findings in recent years, the functions of most lncRNAs remain largely unknown: they are often transcribed from non-coding regions of the genome, and they lack conservation across species. The project aims to develop an efficient graph neural network method called Layer-stacked ATTention Embedding to Gene Ontology (LATTE2GO) to reliably annotate lncRNA functions, describing each lncRNA with gene ontology features spanning molecular function, biological process, and cellular component. Research activities will engage minorities, women, and undergraduates in interdisciplinary research through the Girl Engineering Summer Camp, the Louis Stokes Alliances for Minority Participation Summer Research Academy, and the McNair programs at the University of Texas at Arlington.

The research will aggregate the gene ontology structure and multiple interactions between genes, transcripts, and proteins into a knowledge graph containing heterogeneous relationships. The project will (1) extract higher-order multi-omics interrelations from heterogeneous interactions as well as multi-relational associations; (2) develop representation learning of lncRNA functions from multiple relationships in the hierarchical gene ontology within the same message-passing framework; and (3) explore attention graph neural networks to effectively aggregate heterogeneous interactions and the relevance of gene ontology terms. By extracting higher-order associations and weighting them via attention, LATTE2GO aims to achieve significant gains over previous graph-based function prediction techniques. In addition, the architecture has the advantage of learning features directly from the complete hierarchical ontology and connecting them with lncRNA network relations in an end-to-end manner. The novel framework could be extended to integrate multiple heterogeneous data sources for generic computational and data science problems.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
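The idea of weighting heterogeneous interactions via attention, as described in the abstract, can be illustrated with a small sketch. This is not the LATTE2GO implementation, which the abstract does not specify. It is a minimal toy example, assuming one mean-pooled message per relation type and a fixed scoring vector in place of learned attention parameters. All node names, relation names, feature values, and function names below are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vectors(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def aggregate(node, features, edges_by_relation, attn_vector):
    """Attention-weighted aggregation over relation types for one node.

    For each relation, incoming neighbor features are mean-pooled into one
    message. Each message is scored by a dot product with attn_vector
    (a stand-in for learned parameters), and the softmax of the scores
    weights the messages in the final embedding.
    """
    relations, messages = [], []
    for rel, edges in edges_by_relation.items():
        neighbors = [features[src] for src, dst in edges if dst == node]
        if neighbors:
            relations.append(rel)
            messages.append(mean_vectors(neighbors))
    if not messages:
        # No incoming edges: fall back to the node's own features.
        return list(features[node]), {}
    scores = [sum(a * x for a, x in zip(attn_vector, msg)) for msg in messages]
    weights = softmax(scores)
    dim = len(features[node])
    embedding = [
        sum(w * msg[i] for w, msg in zip(weights, messages)) for i in range(dim)
    ]
    return embedding, dict(zip(relations, weights))


# Hypothetical toy graph: one lncRNA with two kinds of incoming interactions.
features = {
    "lncRNA_A": [0.0, 0.0],
    "gene_1": [1.0, 0.0],
    "gene_2": [0.0, 1.0],
    "protein_1": [1.0, 1.0],
}
edges_by_relation = {
    "coexpressed_with": [("gene_1", "lncRNA_A"), ("gene_2", "lncRNA_A")],
    "binds": [("protein_1", "lncRNA_A")],
}
attn_vector = [1.0, 1.0]  # assumed fixed here; learned in a real model

embedding, weights = aggregate("lncRNA_A", features, edges_by_relation, attn_vector)
print(weights)
print(embedding)
```

In a trained graph neural network the scoring parameters would be learned end to end and the aggregation would be stacked over several layers, so that higher-order (multi-hop) associations also contribute. The sketch shows only a single layer for a single node.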
