Collaborative Research: Semantic Map of Biological Data Sources: Entity Identity and Path Characterization

$343,479 · FY2003 · CSE · NSF

Arizona State University, Scottsdale AZ

Abstract

A fundamental problem facing the biological researcher today is correctly identifying a specific instance of a biological entity, e.g., a specific gene or protein, and then obtaining a complete functional characterization of this entity instance by exploring a multiplicity of inter-related sources. While the diversity of available data presents an opportunity to attack this problem, it is accompanied by difficulties in harnessing and exploring the data. This collaborative, interdisciplinary, inter-institutional research project exploits the researchers' prior expertise in wrapper and mediator technology and applies domain-specific semantic knowledge to this problem. The goal is the correct identification and complete characterization of scientific entities by exploring multiple data sources. This project will address three tasks: (1) constructing a Semantic Map of biological data sources, including unique identifiers, links between sources, attributes, and search and query; (2) learning at the data source, schema, domain, and instance levels, using the concept of an Identity Link to identify equivalent identifiers for the same instance; and (3) learning the properties of physical links and paths that are implemented among multiple data sources. The topology of links (paths), based on their properties, and Link (Path) Equivalence are applied to the tasks of entity identity and characterization. The tools developed in this project will be accessible via the project Web site (http://www.umiacs.umd.edu/research/CLIP/BFEnt02/) and can be applied by biologists to support ongoing scientific investigations in cancer research and diabetes, resulting in a broad impact on medical research and treatment. The educational impact includes a course on data management solutions for bioinformatics researchers and a workshop for biologists on experiment planning using data sources.
