
International Journal of Modern Physics C, Vol. 0, No. (1992) 000–000
© World Scientific Publishing Company

DAPHNE: DATA PARALLELISM
NEURAL NETWORK SIMULATOR*

PAOLO FRASCONI, MARCO GORI, and GIOVANNI SODA
Dipartimento di Sistemi e Informatica
University of Florence
Via di Santa Marta, 3 - 50139 Firenze (Italy)

In this paper we describe the design of Daphne, a parallel simulator for supervised recurrent neural networks trained by Backpropagation through time. The simulator has a modular structure, based on a parallel training kernel running on the CM-2 Connection Machine. The training kernel is written in CM Fortran in order to exploit some advantages of the slicewise execution model. The other modules are written in serial C code. They are used for designing and testing the network, and for interfacing with the training data. A dedicated language is available for defining the network architecture, which allows the use of linked modules.

The implementation of the learning procedures is based on training example parallelism. This dimension of parallelism has been found to be effective for learning static patterns with feedforward networks. We extend training example parallelism to learning sequences with fully recurrent networks. Daphne is mainly conceived for applications in the field of Automatic Speech Recognition, though it can also be used to simulate feedforward networks.

Keywords: Recurrent Neural Networks, Connection Machine, Training Example Parallelism, Speech Recognition.

1. Introduction

Learning time is probably the least appealing feature of neural networks trained by Backprop-like algorithms. In these models, the optimization of connection weights is achieved by defining a quadratic error function and using gradient descent to bring that function to a minimum. In practice, the size of the experiments that can be carried out is limited by the power of the computer being used. For example, learning to discriminate phonetic features with a recurrent neural network (RNN) may require many days of computation on an ordinary workstation. The situation is even worse for complex tasks, such as isolated word recognition on large dictionaries.
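For concreteness, a generic form of the quadratic error and of the gradient-descent update reads as follows (a sketch in standard notation; the symbols E, d, y, w, and the learning rate eta are introduced here for illustration and are not taken from the paper's own definitions):

E = \frac{1}{2} \sum_{p=1}^{P} \sum_{k=1}^{K} \bigl( d_{k}^{(p)} - y_{k}^{(p)} \bigr)^{2}, \qquad \Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}},

where P is the number of training examples and K the number of output units. Since the gradient is accumulated over all P examples at each iteration, the cost of every update grows with the size of the training set, which is precisely what makes training example parallelism attractive.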

At present, several packages exist for simulating neural networks on supercomputers. Some of them are public domain software, such as NeuralShell, Aspirin, and PlaNet. They run on various platforms, including Cray machines and workstations. Some

*This research was partially supported by MURST 40%.