CAREER: Statistical Information Retrieval Modeling for Complex Search
Georgetown University, Washington DC
Abstract
As Web applications grow in popularity and users become more deeply involved in the Web, search engines face challenges of a new degree of complexity. For instance, location-based services collect rich contextual information such as geo-location, season, time, and temperature. Users' search activities have become more complex and are usually task-based, generating a variety of feedback and engagement signals such as clicks, mouse movements, eye-tracking results, and query reformulations. Moreover, search is not only an individual user's personalized activity but also an activity shared by many users with similar information needs. Search engines now have access to richer types of information and larger amounts of data than ever before, and the complexity of that information is tremendous. This demands that search engines be upgraded from retrieval systems that simply look up documents for single queries to decision engines that can pick the best choices for information-seeking tasks.
By disseminating research results through papers and tools, the project will make three types of broad impact. First, the techniques developed in this project will benefit a broad population of everyday users and empower them to deal with complex, task-oriented web search. Second, the algorithms and software developed will provide fellow researchers and practitioners with a set of useful tools for solving information retrieval (IR) problems that involve dynamics. Third, the project will reach out to middle school girls and elementary school students. Any search engine user will find it easy to start using the proposed new search engine. Becoming an expert in IR, however, requires students to be good at mathematics, natural language processing, user interface design, artificial intelligence, and programming. This makes the project an excellent vehicle for attracting young people and minorities to these STEM disciplines.
This project aims to create next-generation search engines or, more specifically, decision engines. The focus will be on designing, experimenting with, and deploying statistical models of the dynamics present in the search process. The technical challenges are: (1) given the complexity of the available data, integrating a search engine at the right places in the larger context of the ultimate information-seeking task; (2) providing theoretical and practical support for formally modeling user engagement and other dynamics in retrieval models to improve retrieval effectiveness; (3) modeling a user's exploration of the information space and optimizing a search engine's actions and algorithms; and (4) modeling interactions between a user and a search engine, as well as interactions among multiple users, creating a dynamic environment in which they all interact and game with one another to achieve a win-win optimization. The success of this project will start a new research field in IR: dynamic IR modeling. The results of this research are expected to have a strong influence on next-generation search engines. The work will build a foundation for future advances in reinforcement learning in IR and game theory in IR.