
Scheduling of Modular Architectures for
Inductive Inference of Regular Grammars
M. Gori, M. Maggini, and G. Soda
Dipartimento di Sistemi e Informatica, Università di Firenze
Via di Santa Marta 3, 50139 Firenze, Italy
Tel. +39 (55) 479.6265, Fax +39 (55) 479.6363
email: [email protected]
Abstract
The problem of inductive inference of regular grammars has recently been addressed with recurrent neural networks by many researchers [Giles et al., 1992; Pollack, 1991; Watrous & Kuhn, 1992]. We claim that recurrent radial basis function (R2BF) networks [Gori et al., 1993a] are very well-suited for dealing with such inferential problems because of their clustered representation of the network states.
On the other hand, the main problems that seem to affect the success of such inferential methods are the vanishing of the gradient [Bengio et al., 1993] and the bifurcation of the weight-space trajectory [Doya, 1993] when learning long-term dependencies, no matter which recurrent network is used.
In particular, in this paper we propose using a modular neural network architecture in which the activations of each module are updated at their own rate. The updating rates and the connections among the different modules are arranged in such a way that the number of updating steps taking place in each module is significantly reduced with respect to the sequence length. Three architectural choices are explored in detail, and their effectiveness is evaluated with experiments of inductive inference of regular grammars.
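The mechanism of per-module updating rates can be sketched as follows. This is a minimal illustration, not the paper's exact model: the stacked layout and the geometric rate schedule (module k updating every 2^k steps) are assumptions made for the example, but they show how a high-level module performs far fewer updating steps than the sequence length.

```python
import numpy as np

class MultiRateRecurrentNet:
    """Sketch of a modular recurrent network in which module k refreshes
    its state only every rates[k] input symbols, so higher modules see a
    much shorter effective sequence than the raw input."""

    def __init__(self, input_dim, state_dim, rates, seed=0):
        rng = np.random.default_rng(seed)
        self.rates = rates
        self.state_dim = state_dim
        # Module 0 reads the external input; each later module reads the
        # state of the module below it.
        in_dims = [input_dim] + [state_dim] * (len(rates) - 1)
        self.W_in = [rng.normal(scale=0.1, size=(state_dim, d)) for d in in_dims]
        self.W_rec = [rng.normal(scale=0.1, size=(state_dim, state_dim))
                      for _ in rates]

    def run(self, sequence):
        """Process a sequence, returning the top module's final state and
        the number of updating steps performed by each module."""
        states = [np.zeros(self.state_dim) for _ in self.rates]
        updates = [0] * len(self.rates)
        for t, x in enumerate(sequence):
            inp = np.asarray(x)
            for k, rate in enumerate(self.rates):
                if t % rate == 0:  # module k is updated only every `rate` steps
                    states[k] = np.tanh(self.W_in[k] @ inp
                                        + self.W_rec[k] @ states[k])
                    updates[k] += 1
                inp = states[k]  # input to the module above
        return states[-1], updates
```

With rates (1, 2, 4), a sequence of length 8 triggers 8, 4, and 2 updating steps in the three modules, respectively, which is the kind of reduction in updating steps the architectures below aim at.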
I Introduction
We have recently explored the integration of connectionist and symbolic processing for dealing with uncertain information [Frasconi et al., 1994a; Gori & Soda, 1993b]. The symbolic support of the models that we have used is a nondeterministic automaton whose transition rules are partly learned from examples. A more thorough understanding of these models has led us to investigate general problems of inductive inference of regular grammars [Angluin & Smith, 1983]. Unlike commonly approached neural network learning tasks, whose target is satisfactory generalization to new examples, the inductive inference of regular grammars requires extracting symbolic rules from the learned configuration. This extraction is likely to succeed if the learning process converges to weight configurations producing clustered representations in the state space. Most of the time, the learned configuration leads to an apparent automaton behavior arising from the learning of sequences of small length. When acting on long sequences, the network behavior is likely to depart to some extent from the associated automaton.
The recently suggested techniques for extracting automata after learning [Giles et al., 1992] are interesting attempts to overcome this problem. However, an implicit assumption for a successful extraction of automata with clustering techniques is that the network state space is fairly well separated into clusters.
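The clustering step behind such extraction techniques can be sketched as follows. This is an illustrative assumption, not the procedure of [Giles et al., 1992]: recorded (state, symbol, next state) triples are quantized by a coarse grid that stands in for a proper clustering algorithm, and a transition table is read off. The sketch only works when, as assumed above, the visited states are already well separated into clusters.

```python
import numpy as np

def extract_transitions(triples, resolution=0.5):
    """Quantize recorded network states into grid cells and read off an
    automaton-like transition table. `triples` holds (state, symbol,
    next_state) items recorded while the trained network processes strings."""

    def cluster_id(state):
        # Map each state vector to the cell of a coarse grid; two states
        # get the same id only if they lie in the same cell.
        return tuple(np.round(np.asarray(state) / resolution).astype(int))

    table = {}
    for state, symbol, next_state in triples:
        table[(cluster_id(state), symbol)] = cluster_id(next_state)
    return table
```

If the learned representation is not well clustered, nearby but distinct network states fall into the same cell (or one cluster straddles two cells), and the extracted table no longer matches the network's behavior on long sequences; this is exactly the failure mode discussed above.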