| ![]() |
, , 1{29 ()
c Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.
Representation of Finite State Automata in
Recurrent Radial Basis Function Networks
PAOLO FRASCONI [email protected]
MARCO GORI [email protected]
MARCO MAGGINI [email protected]
GIOVANNI SODA [email protected]
Dipartimento di Sistemi e Informatica, Universit?a di Firenze
Via di Santa Marta 3 - 50139 Firenze - Italy
Received May 1, 1991
Editor: C. Lee Giles
Abstract. In this paper, we propose some techniques for injecting finite state automata into Recurrent Radial Basis Function networks (R2BF). When providing proper hints and constraining the weight space properly, we show that these networks behave as automata. A technique is suggested for forcing the learning process to develop automata representations that is based on adding a proper penalty function to the ordinary cost. Successful experimental results are shown for inductive inference of regular grammars.
Keywords: Automata, backpropagation through time, high-order neural networks, inductive inference, learning from hints, radial basis functions, recurrent radial basis functions, recurrent networks
1. Introduction
The ability of learning from examples is certainly the most appealing feature of
neural networks. In the last few years, several researchers have used connectionist
models for solving different kinds of problems ranging from robot control to pattern
recognition. Coping with optimization of functions with several thousands of
variables is quite common. Surprisingly, in many practical cases, global or nearglobal
optimization is attained also with non sophisticated numerical methods. For
example, successful applications of neural nets for recognition of handwritten characters
(le Cun, 1989) and for phoneme discrimination (Waibel et al., 1989) have
been proposed which do not report serious convergence problems.
Some attempts to understand the theoretical reasons for the successes and failures
of supervised learning schemes have been carried out which explain when such
schemes are likely to succeed in discovering optimal solutions (Bianchini et al.,
1994; Gori & Tesi, 1992; Yu, 1992), and to generalize to new examples (Baum
& Haussler, 1989). These results give some theoretical foundations to learning
from tabula rasa configurations, but unfortunately, the conditions they provide for
optimal convergence and for generalization are quite limited in practice. Although