
Applying Explanation-based Learning to Control and Speeding-up Natural Language Generation

Günter Neumann

DFKI GmbH

Stuhlsatzenhausweg 3

66123 Saarbrücken, Germany

[email protected]

Abstract

This paper presents a method for the automatic extraction of subgrammars to control and speed up natural language generation (NLG). The method is based on explanation-based learning (EBL). The main advantage of the proposed method for NLG is that the complexity of the grammatical decision-making process during generation can be vastly reduced, because the EBL method supports the adaptation of an NLG system to a particular use of a language.

1 Introduction

In recent years, a machine learning technique known as Explanation-based Learning (EBL) (Mitchell, Keller, and Kedar-Cabelli, 1986; van Harmelen and Bundy, 1988; Minton et al., 1989) has been applied successfully to control and speed up natural language parsing (Rayner, 1988; Samuelsson and Rayner, 1991; Neumann, 1994a; Samuelsson, 1994; Srinivas and Joshi, 1995; Rayner and Carter, 1996). The core idea of EBL is to transform the derivations (or explanations) computed by a problem solver (e.g., a parser) into generalized and compact forms that can be used very efficiently to solve similar problems in the future. In parsing, EBL has primarily been used to specialize a given source grammar automatically to a specific domain. In that case, EBL serves as a method for adapting a general grammar and/or parser to the sub-language defined by a suitable training corpus (Rayner and Carter, 1996).
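The core idea can be illustrated with a deliberately small sketch for the parsing case. It is not the method of any of the cited systems: the tuple encoding of derivations, the toy `LEXICON`, and the helper names `generalize`, `learn`, and `recognize` are all assumptions made for this example. A derivation is generalized by cutting it off at the preterminal categories, and the resulting template is stored so that a later sentence with the same category sequence is handled by a single lookup instead of a full parse.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon: word -> preterminal category.
LEXICON: Dict[str, str] = {
    "the": "Det", "a": "Det",
    "dog": "N", "cat": "N",
    "sees": "V", "chases": "V",
}

def generalize(tree: tuple) -> Tuple[str, ...]:
    """Collapse a derivation into the sequence of its preterminal categories.

    A derivation is a nested tuple (category, child, ...); a preterminal
    node has exactly one child, which is a word (a plain string).
    """
    cat, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return (cat,)  # cut here: the word itself is generalized away
    result: Tuple[str, ...] = ()
    for child in children:
        result += generalize(child)
    return result

def learn(derivation: tuple, memory: Dict[Tuple[str, ...], str]) -> None:
    """Store the generalized template together with the root category."""
    memory[generalize(derivation)] = derivation[0]

def recognize(words: List[str], memory: Dict[Tuple[str, ...], str]) -> Optional[str]:
    """Reuse a stored template: one dictionary lookup, no search."""
    key = tuple(LEXICON[w] for w in words)
    return memory.get(key)

# One training derivation, as a general parser might have produced it.
training = ("S",
            ("NP", ("Det", "the"), ("N", "dog")),
            ("VP", ("V", "sees"), ("NP", ("Det", "a"), ("N", "cat"))))

memory: Dict[Tuple[str, ...], str] = {}
learn(training, memory)
```

After training on one sentence, `recognize("a cat chases the dog".split(), memory)` succeeds because the new sentence shares the generalized template, while a word sequence with an unseen category pattern returns `None` and would have to fall back to the general parser.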

A specialized grammar can be seen as describing a domain-specific set of prototypical constructions. Therefore, the EBL approach is also very interesting for natural language generation (NLG). Informally, NLG is the production of a natural language text from a computer-internal representation of information, where NLG can be seen as a complex (potentially cascaded) decision-making process. Commonly, an NLG system is decomposed into two major components, viz. the strategic component, which decides 'what to say', and the tactical component, which decides 'how to say' the result of the strategic component. The input of the tactical component is basically a semantic representation computed by the strategic component. Using a lexicon and a grammar, its main task is to compute potentially all possible strings associated with a semantic input. Now, in the same sense as EBL is used in parsing as a means to control the range of possible strings as well as their degree of ambiguity, it can also be used for the tactical component to control the range of possible semantic inputs and the degree of paraphrase they admit.
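As a rough illustration of the tactical component's task, the following toy sketch maps one semantic input to all strings licensed by a two-rule grammar and then restricts the rule set to reduce the number of paraphrases. The semantic encoding, the `LEXICON` and `GRAMMAR` tables, and the function `realize` are hypothetical and chosen only for this example; a real tactical generator works with a declarative grammar rather than hand-written rule functions.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical lexicon: concepts -> surface forms.
NOUNS: Dict[str, str] = {"dog": "the dog", "cat": "the cat"}
VERBS: Dict[str, Dict[str, str]] = {
    "chase": {"active": "chases", "passive": "is chased by"},
}

Sem = Dict[str, str]  # e.g. {"pred": "chase", "agent": "dog", "patient": "cat"}

def active(sem: Sem) -> str:
    """Realize the semantic input as an active clause."""
    return f"{NOUNS[sem['agent']]} {VERBS[sem['pred']]['active']} {NOUNS[sem['patient']]}"

def passive(sem: Sem) -> str:
    """Realize the same semantic input as a passive clause."""
    return f"{NOUNS[sem['patient']]} {VERBS[sem['pred']]['passive']} {NOUNS[sem['agent']]}"

# Hypothetical grammar: rule name -> realization rule.
GRAMMAR: Dict[str, Callable[[Sem], str]] = {"active": active, "passive": passive}

def realize(sem: Sem, allowed: Optional[List[str]] = None) -> List[str]:
    """Return every string the (possibly restricted) grammar associates with sem."""
    names = list(GRAMMAR) if allowed is None else allowed
    return [GRAMMAR[name](sem) for name in names]

sem = {"pred": "chase", "agent": "dog", "patient": "cat"}
all_paraphrases = realize(sem)            # full grammar: two paraphrases
controlled = realize(sem, ["active"])     # restricted grammar: one string
```

With the full grammar the generator returns both the active and the passive realization; restricting it to the rules observed in a given domain leaves a single string, which is the kind of control over paraphrases that a specialized subgrammar is meant to provide.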

In this paper, we present a novel method for the automatic extraction of subgrammars to control and speed up natural language generation. Its main advantage for NLG is that the complexity of the (linguistically oriented) decision-making process during natural language generation can be vastly reduced, because the EBL method supports the adaptation of an NLG system to a particular language use. The core properties of this new method are:

• prototypically occurring grammatical constructions can be extracted automatically;

• generation of these constructions is vastly sped up using simple but efficient mechanisms;

• the new method supports partial matching, in the sense that new semantic input need not be completely covered by previously trained examples;

• it can easily be integrated with recently developed chart-based generators as described in,