Gathering and indexing rich fragments of the World Wide Web

Holmes, G., Rogers, W. J. (1997) Proc ICCE’97, Kuching, Sarawak, Malaysia, pp 554-562.

While the World Wide Web (WWW) is an attractive resource for teaching and research, it has some undesirable features. The cost of allowing students unlimited access can be high, both in money and time: students may become addicted to ‘surfing’ the web (exploring purely for entertainment) and jeopardise their studies. Students are likely to discover undesirable material because large-scale search engines index sites regardless of their merit. Finally, the explosive growth of WWW usage means that servers and networks are often overloaded, to the extent that a student may gain a very negative view of the technology. We have developed software that attempts to address these issues by capturing rich fragments of the WWW onto local storage media. A collection can be put onto CD-ROM, providing portability and inexpensive storage. This makes it possible to present the WWW to distance-learning students who do not have Internet access. The software interfaces with standard, commonly available web browsers, acting as a proxy server to the files stored on the local media, and provides a search engine giving full-text search capability within the collection.