Computational Capabilities of Local-Feedback Recurrent
Networks Acting as Finite-State Machines
Dipartimento di Sistemi e Informatica
Universit?a di Firenze (Italy)
July 28, 1995
In this paper we explore the expressive power of recurrent networks with local feedback connections for symbolic data streams. We rely on the analysis of the maximal set of strings that can be shattered by the concept class associated to these networks (i.e., strings that can be arbitrarily classified as positive or negative), and find that their expressive power is inherently limited, since there are sets of strings that cannot be shattered, regardless of the number of hidden units.
The computational capabilities of artificial neural networks have recently received the attention of many researchers, particularly for the case of multilayer perceptrons (see e.g. ). For the case of recurrent neural networks (RNNs), many efforts have focused on symbolic tasks in which the input is restricted to a finite alphabet. A noticeable result is the one by Siegelman & Sontag , stating that such networks are computationally equivalent to Turing machines. Many researchers, however, have raised the question whether recurrent networks with a constrained architecture are still endowed with universal computational capabilities. The issue is motivated by at least two factors. First, fully recurrent networks are likely to be more difficult to train than other recurrent networks with a constrained architecture . Second, networks with a simpler architecture may yield better generalization to new examples, according to some generic parsimony principle. Some results on the computational capabilities of recurrent networks with constrained architecture have also been established. Siegelman et al. , for example, have shown that recurrent nonlinear autoregressive (NARX) networks (in which feedback is allowed from the output to the input, but not among the hidden units) are also Turing equivalent, except for a linear slowdown.
In this paper we consider a class of networks in which the recurrent connections are restricted to self-loops in a subset of the computational units. This is the simplest form of recurrence that can be considered. Networks having such topological constraint are referred to as local feedback recurrent networks (LFRNs)  or globally feedforward locally recurrent networks . One advantage of these networks is that they can be trained using a special version of the error backpropagation algorithm that is local in both space and time (see, e.g., [7, 8, 9]), thus allowing on-line training without incurring in the heavy computational burden of other local algorithms such as real-time recurrent learning (see, e.g., ).
Horne & Giles  have experimentally found that, despite their apparently weak expressive power, LFRNs are unexpectedly adequate for inferring finite memory grammars . However, to the best of our knowledge no attempt has been reported to theoretically characterize the class of FSMs that can be simulated by recurrent networks having local feedback connections only.