Natural language processing for information retrieval
David D. Lewis
AT&T Bell Laboratories
Karen Sparck Jones
Computer Laboratory, University of Cambridge
July 1993
1 Abstract
The paper summarizes the essential properties of document retrieval and reviews both conventional practice and research findings, the latter suggesting that simple statistical techniques can be effective. It then considers the new opportunities and challenges presented by the ability to search full text directly (rather than e.g. titles and abstracts), and suggests appropriate approaches to doing this, with a focus on the role of natural language processing. The paper also comments on possible connections with data and knowledge retrieval, and concludes by emphasizing the importance of rigorous performance testing.
This paper will appear in Communications of the ACM.