Project Note
MULINEX
Multilingual Indexing, Navigation and Editing Extensions
for the World-Wide Web
Gregor Erbach, Günter Neumann, Hans Uszkoreit
DFKI GmbH
Language Technology Lab
66123 Saarbrücken
Germany
[email protected]
Abstract
This paper gives an overview of the project MULINEX, a "leading-edge application project" funded in the Telematics Application Programme (Language Engineering Sector) of the European Union. The goal of the project is to develop a set of tools that provide cross-language text retrieval for the WWW, concept-based indexing, navigation tools and website management facilities for multilingual WWW sites. The project takes a user-centered approach in which user needs drive the development activities and set the research agenda.
1 Overview and Objectives
MULINEX is a "leading-edge application project" which addresses the requirements of two kinds of users: web content providers and service operators who wish to provide multilingual information, and the customers of such multilingual information services (henceforth referred to as end users). The objective of the project is to provide multilingual search, retrieval and navigation functionalities for the WWW.
Leading-edge application projects aim at advanced applications based on existing or emerging IC components and novel Language Engineering technologies. The goal is to meet user requirements dictated by socio-economic changes over the next few years (from the call for project proposals for the Telematics Application Programme).
The socio-economic changes addressed by the MULINEX project are the emergence and widespread acceptance of the WWW, the increasing availability of gigabytes of information in different languages, and the increasing number of people with different mother tongues who need to find information on the web.
Providers of web search engines are already producing
localised versions for different countries (e.g., lycos.de
for Germany). So far, however, these localise only the user
interface and the advertisements; the search and retrieval
process itself is not language-aware.
The technologies to be used in the project include a state-of-the-art
information retrieval system, advanced
linguistic processing tools (morphological analysis, information
extraction, lexical semantics), algorithms for
alignment of translated texts and terminology extraction,
and machine translation systems.
The intended prototype application can run entirely on
the server of a content provider or search service
operator, so that the end user needs only a standard web
browser such as Netscape Navigator, Alis Tango or
Microsoft Internet Explorer. The project is committed to
supporting open web standards and will avoid
dependence on proprietary formats and solutions, in
order to make the results applicable to a wider user base.
The application will be realised as a group of interacting
tools which improve access to information (search and
navigation) in multilingual web document collections,
and support the creation and maintenance of multilingual
content for the web by information providers. The set of
tools will provide the following search, retrieval and
navigation functionality for the end user:
1. search by a combination of keywords, phrases, and concepts
2. retrieval of documents in different languages with one monolingual query through multilingual indexing
3. online generation and presentation of navigation maps or menus for supporting interactive refinement of query and search
4. exploitation of context and user profiling information for selecting relevant documents
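Item 2 above can be illustrated with a small sketch. The text does not describe how the multilingual index is built, so the following is only one possible reading: a monolingual query is expanded with translations from a bilingual dictionary and matched against a single index over documents in several languages. The dictionary EN_DE, the sample collection DOCS and all function names are hypothetical and serve only as illustration; they are not part of the MULINEX system.

```python
from collections import defaultdict

# Hypothetical toy bilingual dictionary (English -> German), illustration only.
EN_DE = {"train": ["zug"], "timetable": ["fahrplan"]}

# Hypothetical mixed-language document collection.
DOCS = {
    "en1": "Train timetable for Saarbrücken",
    "de1": "Fahrplan: Zug nach Saarbrücken",
    "en2": "Weather report",
}

PUNCTUATION = ".,;:!?()\"'"


def tokenize(text):
    """Lower-case the text and split it into terms without punctuation."""
    terms = (t.strip(PUNCTUATION).lower() for t in text.split())
    return [t for t in terms if t]


def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def cross_language_search(query, index, dictionary):
    """Return ids of documents that contain, for every query term,
    either the term itself or one of its dictionary translations."""
    result = None
    for term in tokenize(query):
        hits = set()
        for variant in [term] + dictionary.get(term, []):
            hits |= index.get(variant, set())
        result = hits if result is None else result & hits
    return sorted(result or [])


index = build_index(DOCS)
print(cross_language_search("train timetable", index, EN_DE))
```

With the sample data, the English query "train timetable" matches both the English and the German document. A realistic system would add morphological analysis and term weighting, which this sketch leaves out.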
In addition, it will offer functionalities for the management of multilingual websites. These will only be