| ![]() |
Recurrent Neural Networks and Prior Knowledge for Sequence
Processing: A Constrained Nondeterministic Approach
Paolo Frasconi, Marco Gori, and Giovanni Soda
Dipartimento di Sistemi e Informatica
Via di Santa Marta 3 - 50139 Firenze (Italy)
e-mail: fpaolo,[email protected]
Knowledge-Based Systems, to appear.
January 20, 1995
In this paper we focus on methods for injecting prior knowledge into adaptive recurrent
networks for sequence processing. In order to increase the flexibility needed for specifying
partially known rules, we propose a nondeterministic approach for modeling domain knowledge.
The algorithms presented in this paper allow to map time-warping nondeterministic
automata into recurrent architectures with first-order connections. This kind of automata is
suitable for modeling temporal scale distortions in data such as acoustic sequences occurring
in problems of speech recognition. The algorithms output a recurrent architecture and a
feasible region in the connection weight space. We demonstrate that, as long as the weights
are constrained into the feasible region, the nondeterministic rules introduced using prior
knowledge are not destroyed by learning. This paper focuses primarily on architectural
issues, but the proposed method allows the connection weights to be subsequently tuned to
adapt the behavior of the network to data.
Keywords: Recurrent neural networks, prior knowledge, rule insertion, nondeterministic
finite automata.
INTRODUCTION
The integration of connectionist and symbolic processing approaches has recently received attention by many researchers [1, 2, 3, 4, 5], mainly because it allows to jointly exploit the bottom-up (learning from data) and top-down (deductive reasoning) kinds of inference. A relevant aspect to such integration is the development of techniques for embedding rulebased domain knowledge into connectionist models. In this paper we focus on processing sequential streams of data by recurrent neural networks.
There are two main benefits that can be gained by introducing prior knowledge into an adaptive network: improving generalization to new instances and simplifying learning.