Collaborative Research: Pattern Matching - Theory and Practice
Georgia Tech Research Corporation, Atlanta GA
Investigators
Abstract
The explosive growth of the Internet and the terabytes of heterogeneous data it holds have made the problem of searching such large digital databases more acute. The challenges involved go beyond questions in information retrieval (IR). One may envision seeking images in films, words in voice data, and phrases in compressed files and in files of various types. These challenges have spurred the appearance of myriad start-up companies and ad hoc methods for the various tasks. The PIs' approach has been a basic, bottom-up, long-term study of the theory of searching. They have a large center of pattern matching research that has pursued, and continues to pursue, an understanding of the theoretical underpinnings of generalized searching, coupled with applications of their various ideas. The current research will continue the investigation of issues in generalized searching, in particular: 1. Approximate indexing with a small number of errors. 2. In-place compressed search. 3. "Reusable" dynamic programming code. 4. Parameterized matching with "don't cares". Research on the theory of image processing will also be continued. The particular areas of concentration are: 1. The effect of digitization. 2. Real multi-dimensional scaling. 3. Efficient search of rotated images. The investigators' research group has started a program of selective implementation of advanced pattern matching ideas, some in conjunction with research groups from other application areas. There are plans to implement text fingerprinting ideas and test their applicability in IR. In addition, a project is planned that incorporates many of the ideas on searching compressed and heterogeneous files by constructing an automatic scientific home-page generator and maintainer (guaranteed to be an instant hit with all professors of Computer Science!).