Fundamental Limits for Information Retrieval
Regents Of The University Of Michigan - Ann Arbor, Ann Arbor MI
Investigators
Abstract
The fundamental limits of performance for a general model of information retrieval from databases are studied. In the scenarios to be considered a large quantity of information is to be stored on some physical storage device. Requests for information are modeled as randomly generated sequences with a known distributions. The requests are assumed to be "context-dependent", i.e., to vary according to the sequence of previous request. The state of the physical storage device is also assumed to depend on the history of previous requests. In general the logical structure of the information to be stored does not match the physical structure of the storage device, and consequently there are nontrivial limits on the minimum achievable average access times, where the average is over the possible sequences of user request. The propose research will apply information-theoretic methods to establish these limits. Preliminary results demonstrate the potential of such methods for this problem, giving very accurate results for some (relatively general) first cases. Allowing redundancy in general can greatly lower the achievable access times, even when the amount added is an arbitrarily small fraction of the total amount of information in the dababase. Preliminary results establish the achievable limits both with and without redundancy for some simplified cases; very many interesting problems arise in general.
View original record on NSF Award Search →