



L. Lastrucci†, G. Bellesi‡, M. Gori†, and G. Soda†

†Dipartimento di Sistemi e Informatica
Università di Firenze
Via di Santa Marta 3 - 50139 Firenze - Italy
Tel. +39 (55) 479.6361 - Fax +39 (55) 479.6363
e-mail : [email protected]

‡Softeam Applicazioni di Base
Via P. Carpini 1 - 50127 Firenze - Italy
Tel. +39 (55) 422.1494 - Fax +39 (55) 434.126

Abstract - In this paper, we propose a modular architecture in which the interactions among different modules are controlled by suitable autoassociators. The outputs of these modules are computed by sigma-pi neurons whose inputs come from both a feedforward network performing classification and an autoassociator. The outputs of the autoassociators are used for pattern rejection, which significantly reduces the problems caused by the interaction of different modules. The proposed architecture is validated by speaker-independent phoneme recognition experiments on continuous speech from the TIMIT database, with very promising results.
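
As a rough illustration of the idea summarized in the abstract, the following minimal sketch shows one way a module could combine a classifier with an autoassociator. It is not the architecture evaluated in this paper: the squared reconstruction error, the exponential acceptance score, the `scale` and `threshold` parameters, and all function names are assumptions introduced here for illustration only. The multiplicative gating is meant to mimic the product term of a sigma-pi unit.

```python
import math


def reconstruction_error(x, x_hat):
    """Squared Euclidean distance between a pattern and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def acceptance(error, scale=1.0):
    """Map an error to a score in (0, 1]; 1 means perfect reconstruction.

    The exponential form and `scale` are assumptions, not taken from the paper.
    """
    return math.exp(-error / scale)


def gated_outputs(class_scores, accept):
    """Multiply each classifier score by the autoassociator acceptance."""
    return [score * accept for score in class_scores]


def module_decision(class_scores, x, x_hat, scale=1.0, threshold=0.5):
    """Return the index of the winning class, or None if the pattern is rejected.

    `x_hat` stands for the autoassociator's reconstruction of `x`; here it is
    passed in directly instead of being computed by a trained network.
    """
    accept = acceptance(reconstruction_error(x, x_hat), scale)
    if accept < threshold:
        return None  # the module rejects a pattern it reconstructs poorly
    gated = gated_outputs(class_scores, accept)
    return max(range(len(gated)), key=gated.__getitem__)
```

In this sketch a pattern that the autoassociator reconstructs well passes through to the classifier scores unchanged, while a poorly reconstructed one is rejected before it can interfere with the decisions of other modules.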


In the last few years, many researchers have focused their efforts on specializing neural networks, more or less related to the Backpropagation learning scheme, for phoneme recognition. Despite the encouraging results obtained for phoneme discrimination, not enough attention has so far been paid to scaling up such solutions, which is certainly an important issue for any practical application. As Jacobs pointed out [1], monolithic networks suffer from two problems, namely spatial and temporal crosstalk, which lead us to believe that modular systems are necessary for training nets on complex problems like phoneme recognition. Spatial crosstalk occurs when different groups of units serve different tasks; in this case hidden units being trained to resolve resid-