
A Hybrid Learning Model for Reactive Sequential Decision Making

Ron Sun*

Department of Computer Science

The University of Alabama

Tuscaloosa, AL 35487

Todd Peterson

Department of Computer Science

The University of Alabama

Tuscaloosa, AL 35487

Abstract

In order to develop versatile agents that learn in situated contexts and generalize resulting knowledge to different environments, we explore the possibility of learning both procedural and declarative knowledge in a hybrid connectionist architecture. The architecture is based on the two-level idea proposed earlier by the authors. The architecture integrates reactive routines, rules, learning, and decision-making in a unified framework, and structures different learning components in a synergistic way to perform on-line, integrated learning.

Introduction

There has been a large amount of work demonstrating the difference between procedural knowledge and declarative knowledge (or conceptual and subconceptual knowledge; e.g., Anderson 1982, 1994, Keil 1989, Damasio et al. 1990, Sun 1994). It is believed that a balance of the two is essential to the development of complex cognitive agents. For example, one way to learn a sequential navigation task, such as navigating a maze, is through trial-and-error: repeated practice gradually gives rise to a set of procedural skills that deal specifically with the practiced situations and their minor variations. However, such skills may not be transferable to truly novel situations, since they are so embedded in specific contexts and tangled together. In order to deal with novel situations, the agent needs to discover some general rules. Generic knowledge helps to guide the exploration of novel situations, and reduces the time (i.e., the number of trials) necessary to develop specific skills in new situations. Generic knowledge can also help in communicating the process and the skill of navigation to other agents. If properly used, generic knowledge that is extracted on-line during learning can help to facilitate the very learning process itself.
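The trial-and-error development of procedural navigation skills described above can be illustrated with a minimal reinforcement-learning sketch. This is a tabular Q-learning example on a hypothetical five-state corridor, not the paper's connectionist implementation; the states, actions, rewards, and parameter values are all assumptions made for illustration:

```python
import random

random.seed(0)

# Hypothetical 1-D corridor "maze": states 0..4, goal at state 4.
# Actions: 0 = left, 1 = right. Reaching the goal yields reward 1.
N_STATES, GOAL = 5, 4
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action]

def step(s, a):
    """Deterministic transition; reward 1 only on reaching the goal."""
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0)

for episode in range(500):
    s = 0
    while s != GOAL:
        # Epsilon-greedy selection: mostly exploit, occasionally explore.
        a = random.randrange(2) if random.random() < EPSILON else Q[s].index(max(Q[s]))
        s2, r = step(s, a)
        # Standard Q-learning update toward the bootstrapped target.
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2

# The learned greedy policy moves right from every non-goal state.
policy = [Q[s].index(max(Q[s])) for s in range(N_STATES - 1)]
print(policy)  # [1, 1, 1, 1]
```

Note how the resulting Q-table encodes a skill specific to this one corridor: nothing in it transfers to a differently shaped maze, which is precisely the limitation that motivates extracting generic rules.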

There has been a variety of work that deals with only one type of knowledge or the other, including work on reinforcement learning and autonomous systems (such as Mahadevan and Connell 1992, Barto et al. 1990), and work on rule learning and encoding, including connectionist versions (such as Sun 1992, Towell and Shavlik 1993). There are also existing AI models that combine both types of knowledge, but they usually simply juxtapose the two and often do not perform integrated learning (cf. Gat 1992, Schneider and Oliver 1991, Hendler 1987). We will instead explore the possibility of a two-level integrated connectionist architecture in this work. The basic desiderata for this work are: the learning process should be on-line (in real time); it should be autonomous and reactive; it should adapt to changing environments; and it should be integrated (that is, different types of representations should be developed simultaneously alongside each other).

*This work is supported in part by the Office of Naval Research under grant number N00014-95-1-0440.

In the rest of this paper, we introduce the architecture by first identifying its two-level structure, and then discussing the realization of one level with a reinforcement learning network and the realization of the other level with a rule network. We present detailed experimental results that demonstrate the learning capability of the model, which sometimes exceeds that of its constituent models.

A Two-level Architecture

There have been various two-level architectures proposed, such as Hendler (1987), Gelfand et al. (1989), Schneider and Oliver (1991), and Sun (1992a, 1994). Among them, Consyderr (Sun 1992a) is the most integrated one. It consists of a concept level and a feature level. The representation is localist in the concept level, with one node for each concept, and distributed in the feature level, with a non-exclusive set of nodes for each concept.
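The contrast between the two kinds of representation can be sketched concretely. The concepts and features below are hypothetical examples chosen for illustration, not taken from Consyderr itself:

```python
# Concept level: localist -- one dedicated node per concept.
concept_nodes = {"sparrow": 0, "penguin": 1, "bat": 2}

# Feature level: distributed -- each concept activates a non-exclusive
# set of shared feature nodes, so the patterns of different concepts overlap.
features = ["has_wings", "flies", "has_feathers", "nocturnal"]
feature_sets = {
    "sparrow": {"has_wings", "flies", "has_feathers"},
    "penguin": {"has_wings", "has_feathers"},
    "bat":     {"has_wings", "flies", "nocturnal"},
}

def feature_vector(concept):
    """Distributed activation pattern for one concept over the feature nodes."""
    return [1 if f in feature_sets[concept] else 0 for f in features]

print(feature_vector("sparrow"))  # [1, 1, 1, 0]
print(feature_vector("bat"))      # [1, 1, 0, 1]
```

The overlap between patterns (e.g., sparrow and bat share "has_wings" and "flies") is what allows feature-level, similarity-based processing to complement the exact, one-node-per-concept processing at the concept level.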

Based on Consyderr, we will present a more complete and integrated architecture to tackle the problem of exploring both procedural and declarative knowledge in one framework. The architecture is called Clarion, or Connectionist Learning with Adaptive Rule Induction ON-line. It consists of two levels: the top level is a rule level and the bottom level is a reactive level; see Figure 1. The reactive level contains reactive routines (Agre and Chapman 1990), or procedural knowledge, acquired through connectionist reinforcement learning; and the rule level contains rules, or declarative knowledge, acquired through rule extraction (and a variety of other