SBIR Phase II: Xtractica - A System for Extracting Coherent Data from Documents
Xsb, Inc., New York NY
Abstract
This Small Business Innovation Research Phase II project will implement a software system that allows domain experts to specify programs that transform unstructured or partially structured data from a variety of document sources, such as World Wide Web sites, PDF files, and text, into structured, coherent, and readily usable information. The system will consist of a set of tightly integrated syntactic and semantics-driven data extraction technologies managed from a graphical user interface. The goal is to retrieve information that was created for human readers and turn it into knowledge that can support automated decision-making and transactions. The system will empower users who are knowledgeable about their application domains, but not necessarily trained as computing technologists, to rapidly structure data into knowledge. The Phase II implementation effort will build upon the results of the Phase I feasibility study to produce a fully functional system. Phase III will make the system commercially available to clients with diverse business interests, including content aggregation, e-procurement, ERP, and supply chain management vendors.