page 1  (7 pages)
2to next section

TDL|A Type Description Language

for Constraint-Based Grammars

Hans-Ulrich Krieger, Ulrich Schafer

fkrieger,[email protected]

German Research Center for Artificial Intelligence (DFKI)

Stuhlsatzenhausweg 3, D-66123 Saarbrucken, Germany

Abstract

This paper presents TDL, a typed feature-based representation language and inference system. Type definitions in TDL consist of type and feature constraints over the boolean connectives. TDL supports open- and closedworld reasoning over types and allows for partitions and incompatible types. Working with partially as well as with fully expanded types is possible. Efficient reasoning in TDL is accomplished through specialized modules. Topical Paper. Topic Area: software for NLP, grammar formalism for typed feature structures.

1 Introduction

Over the last few years, constraint-based grammar formalisms have become the predominant paradigm in natural language processing and computational linguistics. Their success stems from the fact that they can be seen as a monotonic, high-level representation language for linguistic knowledge which can be given a precise mathematical semantics. The main idea of representing as much linguistic knowledge as possible through a unique data type called feature structure, allows the integration of different description levels without taking care of interface problems. While the first approaches relied on annotated phrase structure rules (e.g., PATR-II), modern formalisms try to specify grammatical knowledge as well as lexicon entries entirely through feature structures. In order to achieve this goal, one must enrich the expressive power of the first unification-based formalisms with different forms of disjunctive descriptions. Later, other operations came into play, e.g., (classical) negation. Other proposals consider the integration of functional/relational dependencies into the formalism which make them in general Turing-complete (e.g., ALE [4]). However the most important extension to formalisms consists of the incorporation of types, for instance in modern systems like TFS [15], CUF [6], or TDL [7]. Types are ordered hierarchically as it is known from object-oriented programming languages. This leads to multiple inheritance in the description of linguistic entities. Finally, recursive types are necessary to describe at least phrase-structure recursion which is inherent in all grammar formalisms which are not provided with a context-free backbone.

In the next section, we argue for the need and relevance of using types in CL and NLP. After that, we give an overview of TDL and its specialized inference modules. Especially, we have a closer look on the novel features of TDL and present the techniques we have employed in implementing TDL.

2 Motivation

Modern typed unification-based grammar formalisms differ from early untyped systems in that they highlight the notion of a feature type. Types can be arranged hierarchically, where a subtype inherits monotonically all the information from its supertypes and unification plays the role of the primary informationcombining operation. A type definition can be seen as an abbreviation for a complex expression, consisting of type constraints (concerning the sub-/supertype relationship) and feature constraints (stating the appropriate attributes and their values) over the connectives ^, _, and :. Types serve as abbreviations for lexicon entries, ID rule schemata, and universal as well as language-specific principles as is familiar from HPSG. Besides using types as an abbreviational means as templates are, there are other advantages as well which cannot be accomplished by templates:

ffl structuring knowledge
Types together with the possibility to order them hierarchically allow for a modular and clean way to represent linguistic knowledge adequately. Moreover, generalizations can be put at the appropriate levels of representation.

ffl efficient processing
Certain type constraints can be compiled into efficient representations like bit vectors [1], where a GLB (greatest lower bound), LUB (least upper bound), or a ? (type subsumption) computation reduces to low-level bit manipulation; see Section 3.2. Moreover, types release untyped unification from expensive computation through the possibility to declare them incompatible. In addition, working with type names only or with partially expanded types minimizes the costs of copying structures during processing. This can only be accomplished if the system makes a mechanism for type expansion available; see Section 3.4.

ffl type checking
Type definitions allow a grammarian to declare which attributes are appropriate for a given type and which types are appropriate for a given attribute, therefore disallowing one to write inconsistent feature structures. Again, type expansion is necessary to determine the global consistency of a given description.

ffl recursive types
Recursive types give a grammar writer the opportunity to formulate certain functions or relations as recursive type specifications. Working in the type deduction paradigm enforces a grammar writer to replace the context-free back-