SGER: A Digital Library Archive for Computer Scientists
Pennsylvania State Univ University Park, University Park PA
Abstract
CiteSeer, a computer science document search engine and digital library, has over the last three years become the search engine of choice for computer scientists and researchers in related disciplines who explore and access computer science documents. Searchers can find not only the documents themselves but also linked information about them, such as citations, relevance of citations, active bibliography, and co-citations. CiteSeer is popular enough with the computer science community that it receives over 100,000 hits and thousands of document downloads a day. It currently holds over 500,000 documents with over 10 million citations. The goal of this research is to explore methods for making CiteSeer a permanent research fixture for the computer science community. Methods for archiving and URL permanence, along with methods for enhanced archive mirroring and archive scaling, will be investigated. Availability of CiteSeer resources will be promoted through exploratory APIs (application programming interfaces) for researcher data access. In addition, new linked information, such as the organizations and individuals named in acknowledgements, will be correlated with citation rankings. Exploratory research into new document crawling algorithms and procedures will further boost the efficiency and functionality of CiteSeer. This project will have broad impact by giving computer scientists enhanced access to the computer science literature and by encouraging researchers to help build CiteSeer through new additions to the document database.