
BioScholar: a Biomedical Knowledge Engineering framework based on the published l

$266,213 · R01 · FY2008 · GM · NIH

University Of Southern California, Los Angeles CA

Abstract

DESCRIPTION (provided by applicant): Studying the primary research literature is a universal, primary activity for biomedical scientists. It underlies scientists' understanding of their subject and strengthens their capability to plan, execute, and interpret experiments. This proposal is concerned with the maintenance and continued development of software that supports scientists in their scholarly work. Our goal is to develop a knowledge engineering platform (called 'BioScholar') that permits a single graduate student or postdoctoral worker to design, build, curate, and maintain a Knowledge Base (KB) for the literature of interest to a specific laboratory. This continues a previous software development project that was funded by the National Library of Medicine (LM 07061). We will continue to maintain the software using modern software engineering tools and approaches, whilst making it fully interoperable with a widely used ontology engineering platform (Protege/OWL). We will also develop the system's existing capabilities to assist scientists with the management of bibliographic data (citation information and full-text PDF articles). We will further develop tools that allow researchers to annotate PDF files with highlights, simple comments, and structured data. We will then use this annotation framework to drive the process of constructing knowledge bases in Protege/OWL. We will then incorporate Information Extraction (IE) techniques from modern Natural Language Processing (NLP) to improve the efficiency of this curation process. The NLP methods we use are based on the Conditional Random Fields (CRF) model, which is considered state of the art among NLP researchers.
Finally, the most research-oriented component of this proposal is the development of a new methodology for knowledge representation and reasoning in biomedicine based on experimental design, involving experimental controls, independent and dependent variables, statistical significance, and correlation between variables. This representation will be (a) understandable to experimental scientists, (b) lightweight, (c) versatile, and (d) capable of supporting inference between experiments. During the course of this project, we will build a KB for the world-leading neuroendocrinology laboratory of Prof. Alan Watts at the University of Southern California. Prof. Watts' work is concerned with the study of catecholaminergic control of the stress response, drawing on research from a large number of different fields (anatomy, physiology, molecular biology, etc.). After developing this KB, we will test its validity using subjective methods (questionnaires and interviews) and objective experiments ('mock exams' that measure students' performance on test questions based on comprehension of the primary literature). We will release all findings and tools to the biomedical community as research papers and open-source software.

Narrative: This project will help biomedical scientists manage, understand, and communicate the complex information they must learn from scientific papers in multiple biomedical disciplines. As a demonstration of this work, we will build a comprehensive summary of research underlying brain circuits involved in stress. Stress and anxiety disorders are estimated to affect 19.1 million people in the USA, costing $42 billion in health costs per year (source: Anxiety Disorders Association of America).
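
As an illustration of the experiment-based knowledge representation described in the abstract, the following is a minimal sketch in Python. It is not BioScholar's actual data model: the class `Experiment`, the function `linked_experiments`, the 0.05 significance threshold, and all citations and variable names are hypothetical placeholders chosen for this example. The sketch records each experiment by its control, independent and dependent variables, and a p-value, and then performs one simple form of inference between experiments: it links experiment A to experiment B when a variable measured in A is manipulated in B and both results are statistically significant.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Experiment:
    """One experiment as reported in a paper (hypothetical schema)."""
    citation: str            # placeholder identifier for the source paper
    control: str             # description of the experimental control
    independent: List[str]   # variables the experimenter manipulated
    dependent: List[str]     # variables the experimenter measured
    p_value: float           # reported significance of the effect

    def is_significant(self, alpha: float = 0.05) -> bool:
        # alpha = 0.05 is an assumed conventional threshold
        return self.p_value < alpha


def linked_experiments(
    experiments: List[Experiment], alpha: float = 0.05
) -> List[Tuple[str, str, List[str]]]:
    """Return (a, b, shared) triples where a dependent variable of
    experiment a is an independent variable of experiment b and both
    experiments report a significant effect."""
    links = []
    for a in experiments:
        for b in experiments:
            if a is b:
                continue
            shared = sorted(set(a.dependent) & set(b.independent))
            if shared and a.is_significant(alpha) and b.is_significant(alpha):
                links.append((a.citation, b.citation, shared))
    return links


# Placeholder records, not real findings.
EXAMPLES = [
    Experiment("paper-A", "vehicle injection", ["treatment X"], ["marker Y"], 0.01),
    Experiment("paper-B", "sham procedure", ["marker Y"], ["response Z"], 0.03),
    Experiment("paper-C", "sham procedure", ["marker Y"], ["response W"], 0.20),
]

for source, target, shared in linked_experiments(EXAMPLES):
    print(f"{source} -> {target} via {shared}")
```

In this sketch, paper-C is excluded from the chain because its placeholder p-value does not pass the assumed threshold. A real system would also need to reconcile variable names across papers, which is where an ontology such as one built in Protege/OWL could plausibly help.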
