Lexically-generated subject hierarchies for browsing large collections

Nevill-Manning, C. G., Witten, I. H., Paynter, G. W. (1999) International Journal on Digital Libraries 2, 111-123. Springer-Verlag.

Developing intuition for the content of a digital collection is difficult. Hierarchies of subject terms allow users to explore the space of topics that a collection covers, to form and specialize useful query terms, and to directly identify interesting documents. We describe two interfaces for navigating such hierarchies, and present a technique for inferring hierarchies automatically from large corpora. We also discuss scalability issues for the techniques involved, and our solutions to these problems.