A FUNDAMENTAL GOAL OF GENETICS RESEARCH IS TO CHARACTERIZE THE COMPLEX INTERACTIONS GOVERNING HOW AN INDIVIDUAL'S GENETIC LANDSCAPE (GENOTYPE) IMPACTS THEIR OBSERVABLE TRAITS OR CHARACTERISTICS (PHENOTYPE). IN PARTICULAR, UNDERSTANDING HOW MUTATIONS OR VARIANTS WITHIN THE GENETIC LANDSCAPE ALTER NORMAL BIOLOGICAL OR CELLULAR FUNCTION IS CRUCIAL TO ELUCIDATING THE PHYSICAL MECHANISMS BRIDGING GENOTYPE TO PHENOTYPE. WHEN LOCATED WITHIN GENES (STRETCHES OF THE GENOME ENCODING PROTEINS), GENETIC VARIANTS ("ABNORMAL" DNA SEQUENCES) MAY IMPACT CELLULAR FUNCTION THROUGH MULTIPLE MEANS, INCLUDING MODIFYING NORMAL PROTEIN FUNCTION OR VARYING THE AMOUNT OF PROTEIN PRODUCED AT THE WHOLE-ORGANISM OR TISSUE-SPECIFIC SCALE (GENE EXPRESSION). IMPORTANTLY, PHENOTYPE IS DRIVEN BY GENE EXPRESSION, WHICH OCCURS IN A TISSUE-SPECIFIC MANNER, I.E., DIFFERENT SETS OF GENES ARE EXPRESSED TO VARYING DEGREES IN DIFFERENT TISSUES. VARIANTS MAY ALSO BE LOCATED IN REGIONS OF THE GENOME THAT DO NOT ENCODE PROTEINS; THE TRANSCRIPTS OF THESE REGIONS ARE REFERRED TO AS NON-CODING RNA. HISTORICALLY, THESE NON-CODING RNA MOLECULES HAVE OFTEN BEEN OVERLOOKED; HOWEVER, MORE RECENT RESEARCH HAS IMPLICATED NON-CODING RNAS IN A RANGE OF DISEASE PHENOTYPES INCLUDING CANCER, DIABETES, ALZHEIMER'S, CARDIOVASCULAR, AUTOIMMUNE, AND MUSCULOSKELETAL DISEASE. THUS, CATALOGING BOTH TISSUE-SPECIFIC GENE EXPRESSION AND OBSERVED GENETIC VARIANTS IN CODING AND NON-CODING REGIONS IS ESSENTIAL TO FULLY DISENTANGLE THE LINKAGES BETWEEN GENOTYPE AND PHENOTYPE. GENERATING THIS DETAILED CATALOG OF GENE EXPRESSION AND VARIATION REQUIRES A FINE-SCALE, HIGH-QUALITY DESCRIPTION OF THE GENETIC LANDSCAPE. THIS INCLUDES COORDINATE DESCRIPTIONS (I.E., THE POSITIONS OF GENES IN THE GENOME) AND FUNCTION DESCRIPTIONS (I.E., THE PREDICTED BIOLOGICAL FUNCTIONS OF THOSE GENES), REFERRED TO AS THE PHYSICAL AND FUNCTIONAL ANNOTATION, RESPECTIVELY.
WITHOUT ACCURATE PHYSICAL AND FUNCTIONAL ANNOTATION, DETERMINING THE GENOTYPIC CAUSE OF A GIVEN PHENOTYPE IS EXCEEDINGLY DIFFICULT. UNFORTUNATELY, HIGH-QUALITY GENOMIC ANNOTATIONS OF MANY DOMESTIC ANIMAL SPECIES, INCLUDING THE HORSE, ARE LACKING. INDEED, WE ESTIMATE THAT NEARLY HALF OF THE PROTEIN-CODING GENES IN THE MOST RECENTLY RELEASED HORSE GENOME ARE INCORRECTLY ANNOTATED. CORRECT PHYSICAL AND FUNCTIONAL ANNOTATION OF PROTEIN-CODING AND NON-CODING GENES IS NECESSARY TO DETERMINE THE IMPACT OF A GENETIC VARIANT ON GENE FUNCTION AND SUBSEQUENT PHENOTYPE.
RNA-SEQUENCING (RNA-SEQ) IS ONE METHOD THAT CAN BE LEVERAGED TO IMPROVE ANNOTATION AND CATALOG GENE EXPRESSION. BRIEFLY, RNA-SEQ IS A HIGH-THROUGHPUT SEQUENCING METHODOLOGY THAT ENABLES BOTH THE CHARACTERIZATION AND QUANTIFICATION OF ALL PROTEIN-CODING AND NON-CODING GENES, RESULTING IN A TRANSCRIPTOME (THE "COMPLETE" SET OF EXPRESSED GENES) FOR A GIVEN SAMPLE OR TISSUE. MOREOVER, BY COMPILING AND SYNTHESIZING TRANSCRIPTOMES ACROSS A VARIETY OF TISSUE TYPES FROM MULTIPLE INDIVIDUALS, A COMPREHENSIVE TISSUE EXPRESSION ATLAS MAY BE GENERATED. LARGE-SCALE TRANSCRIPTOME PROJECTS HAVE BEEN UNDERTAKEN IN BETTER-CHARACTERIZED ORGANISMS (E.G., MICE, RATS, AND HUMANS), BUT AGRICULTURAL SPECIES HAVE TRADITIONALLY BEEN UNDERSERVED.
IMPROVING FUNCTIONAL ANNOTATION IS AN ESSENTIAL STEP IN LINKING GENOTYPE TO PHENOTYPE. CO-EXPRESSION NETWORKS CAN BE UTILIZED TO INFER FUNCTION OR PROVIDE A FUNCTIONAL CONTEXT FOR GENES WITH UNKNOWN OR NOVEL FUNCTIONS. CO-EXPRESSION NETWORKS SUMMARIZE THE SIMILARITIES IN GENE EXPRESSION BETWEEN LARGE SETS OF GENES. THE CENTRAL IDEA BEHIND THE ANALYSIS OF CO-EXPRESSION NETWORKS IS THAT GENES WITH SIMILAR EXPRESSION ACROSS MULTIPLE TISSUES LIKELY SHARE A BIOLOGICAL FUNCTION; IN ESSENCE, CO-EXPRESSION INFORMS FUNCTION USING "GUILT BY ASSOCIATION".
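THE "GUILT BY ASSOCIATION" IDEA CAN BE ILLUSTRATED WITH A MINIMAL SKETCH. THE EXAMPLE BELOW IS NOT THE PROJECT'S PIPELINE; IT ASSUMES A SIMPLE APPROACH IN WHICH TWO GENES ARE LINKED WHEN THE PEARSON CORRELATION OF THEIR EXPRESSION ACROSS TISSUES MEETS A CUTOFF. ALL GENE NAMES, TISSUE NAMES, EXPRESSION VALUES, AND THE 0.9 THRESHOLD ARE HYPOTHETICAL AND CHOSEN ONLY FOR ILLUSTRATION.

```python
from itertools import combinations
from math import sqrt

# Hypothetical expression values (e.g. normalized counts) for four genes
# across five tissues. These numbers are invented for illustration only.
TISSUES = ["liver", "muscle", "heart", "brain", "skin"]
EXPRESSION = {
    "GENE_A": [10.0, 52.0, 48.0, 5.0, 7.0],
    "GENE_B": [12.0, 50.0, 45.0, 6.0, 9.0],
    "GENE_C": [40.0, 4.0, 6.0, 38.0, 35.0],
    "NOVEL_1": [11.0, 55.0, 50.0, 4.0, 8.0],  # gene with unknown function
}


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def build_network(expression, threshold=0.9):
    """Return {(gene1, gene2): r} for gene pairs with r >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if r >= threshold:
            edges[(g1, g2)] = r
    return edges


def neighbors(edges, gene):
    """Genes directly linked to `gene`; candidates for shared function."""
    linked = set()
    for g1, g2 in edges:
        if g1 == gene:
            linked.add(g2)
        elif g2 == gene:
            linked.add(g1)
    return sorted(linked)


network = build_network(EXPRESSION)
print(neighbors(network, "NOVEL_1"))
```

IN THIS TOY DATA SET, THE UNCHARACTERIZED GENE IS LINKED TO THE TWO GENES THAT SHARE ITS MUSCLE- AND HEART-ENRICHED PROFILE, WHICH WOULD SUGGEST A FUNCTIONAL CONTEXT FOR IT. REAL ANALYSES INVOLVE MANY THOUSANDS OF GENES AND TYPICALLY USE MORE ROBUST SIMILARITY MEASURES AND THRESHOLD SELECTION.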
WE WILL OVERCOME THESE LIMITATIONS BY GENERATING RNA-SEQ DATA FOR THE IDENTIFICATION AND QUANTIFICATION OF BOTH CODING AND NON-CODING GENES ACROSS A DIVERSE SET OF TISSUES IN A SINGLE SET OF HORSES, CREATING A TISSUE GENE EXPRESSION ATLAS. WE WILL DEVELOP A PUBLICLY AVAILABLE GENOME ANNOTATION PIPELINE TO IMPROVE THE PHYSICAL ANNOTATION OF THE EQUINE GENOME. THROUGH QUANTIFICATION OF MULTIPLE GENES IN THE SAME TISSUE, WE WILL BUILD ACROSS-TISSUE (I.E., AT THE ORGANISMAL LEVEL) AND TISSUE-SPECIFIC CO-EXPRESSION NETWORKS TO INFER FUNCTIONAL RELATIONSHIPS BETWEEN CODING AND NON-CODING GENES.
IDENTIFYING AND CHARACTERIZING THE LINK BETWEEN GENOTYPE AND PHENOTYPE IS CRUCIAL FOR ELUCIDATING THE UNDERLYING MECHANISMS OF HEALTH AND DISEASE. AS IS THE CASE WITH MANY AGRICULTURAL ANIMAL SPECIES, INCOMPLETE OR INCORRECT REFERENCE GENOME ANNOTATION LIMITS THE ABILITY TO PREDICT THE IMPACT OF A GENETIC VARIANT ON OBSERVABLE TRAITS OR DISEASE. THE IMPROVED ANNOTATION GENERATED BY THIS PROPOSAL WILL BENEFIT THE ENTIRE RESEARCH COMMUNITY, ENABLING EQUINE RESEARCHERS ACROSS THE GLOBE TO BETTER IDENTIFY AND PRIORITIZE GENES RESPONSIBLE FOR A PHENOTYPE OF INTEREST AND MAKE MORE ACCURATE GENETIC VARIANT IMPACT PREDICTIONS. ADDITIONALLY, THE TISSUE EXPRESSION ATLAS WILL BE OF GREAT VALUE TO OTHER EQUINE RESEARCHERS IN ASSESSING NORMAL TISSUE-SPECIFIC GENE EXPRESSION TO COMPARE AGAINST GENETIC VARIANTS OF INTEREST. FINALLY, BY MAKING OUR ANALYSIS PIPELINE PUBLICLY AVAILABLE AND SPECIES-AGNOSTIC, WE WILL ENABLE AGRICULTURAL RESEARCHERS AT LARGE TO GENERATE TISSUE- OR DISEASE-SPECIFIC TRANSCRIPTOMES BASED ON THEIR OWN RNA-SEQ DATA.
$145,655 | FY2020 | National Institute of Food and Agriculture | USDA
Regents of the University of Minnesota