Optimal learning in artificial neural networks: a theoretical view
M. Bianchini and M. Gori
Dipartimento di Sistemi e Informatica
Università di Firenze
Via di S. Marta, 3
50139 Firenze (Italy)
tel: +39 55 479.6265, fax: +39 55 479.6363
e-mail: [email protected].
Abstract
The effectiveness of connectionist models in emulating intelligent behaviour and solving significant practical problems is strictly related to the capability of the learning algorithms to find optimal or near-optimal solutions and to generalize to new examples. This paper reviews some theoretical contributions to optimal learning in an attempt to provide a unified view and give the state of the art in the field.
The focus of the review is on the problem of local minima in the cost function, which is likely to affect virtually any learning algorithm. Starting from this analysis, we briefly review proposals for discovering optimal solutions and suggest conditions for designing architectures tailored to a given task.
1 Introduction
In the last few years, impressive efforts have been made in using connectionist models both for modelling human behaviour and for solving practical problems. In the fields of cognitive science and psychology, we have been witnessing a debate on the actual role of connectionism in modelling human behaviour. It has been claimed [1] that, like traditional Associationism, Connectionism treats learning as basically a sort of statistical modelling, and that it is not adequate for capturing the rich structure of most significant cognitive processes. As for the actual novelty of the recent renewal of connectionist models, Fodor and Pylyshyn [1] seem quite skeptical and state: "We seem to remember having been through this argument before. We find ourselves with a gnawing sense of déjà vu". A parallel debate has been taking place concerning the application of connectionist models to engineering (pattern recognition, artificial intelligence, motor control, etc.). The arguments addressed in these debates seem strictly related to each other and refer mainly to the peculiar kind of learning that is typically carried out in connectionist models, which seems not to take structure sufficiently into account. Unlike symbolic approaches to machine learning, which are based on "intelligent search" (see e.g. [2]), in connectionist models learning is typically framed as an optimization problem. After the seminal PDP books, Minsky published an extended edition of Perceptrons [3] that contains an intriguing epilogue on PDP's novel issues. He pointed out that