
Baltzer Journals
Unimodal Loading Problems
Monica Bianchini1, Stefano Fanelli2, Marco Gori1, and Marco Protasi2
1Dipartimento di Sistemi e Informatica, Università degli Studi di Firenze
Via di Santa Marta, 3, 50139 Firenze, Italy
Tel. +39 (55) 479.6265  Fax +39 (55) 479.6363
Email: {monica,marco}@mcculloch.ing.unifi.it
2Dipartimento di Matematica, Università di Roma "Tor Vergata"
Via della Ricerca Scientifica, 00133 Roma, Italy
Tel. +39 (6) 7259.4681  Fax +39 (6) 7259.4699
Email: {fanelli,protasi}@mat.utovrm.it
This paper deals with optimal learning and provides a unified viewpoint of the most significant results in the field. The focus is on the problem of local minima in the cost function, which is likely to affect, to some extent, any learning algorithm. We establish some intriguing links between optimal learning and the computational complexity of loading problems. We exhibit a computational model in which the solution of all loading problems giving rise to unimodal error functions requires the same time, thus suggesting that they belong to the same computational class.
Keywords: Backpropagation, computational complexity, optimal learning, premature saturation, spurious and structural local minima, terminal attractor.
1 Learning as optimisation
Supervised learning in multilayered networks (MLNs) can be accomplished thanks to
Backpropagation (BP), which minimises pattern misclassifications by means of
gradient descent on a particular nonlinear least-squares fitting problem. Unfortunately,
BP is likely to be trapped in local minima, and indeed many examples of such minima
have been reported in the literature.
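To fix ideas, the following minimal sketch shows BP as plain gradient descent on a sum-of-squares cost for a one-hidden-layer network. It is an illustration, not the paper's construction: the network sizes, learning rate, and toy XOR task are our own choices, made only to exhibit the kind of nonlinear least-squares problem being minimised.

```python
import numpy as np

# Illustrative sketch: Backpropagation as gradient descent on
# E = 0.5 * sum((y - t)^2) for a 2-3-1 sigmoidal network.
# All hyperparameters and the XOR data are assumptions for this example.

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# toy data: XOR, a classic task whose error surface is nonconvex
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# small random initial weights (large ones risk premature saturation)
W1 = 0.5 * rng.standard_normal((2, 3))
W2 = 0.5 * rng.standard_normal((3, 1))
eta = 0.5  # learning rate

E0 = 0.5 * np.sum((sigmoid(sigmoid(X @ W1) @ W2) - T) ** 2)  # initial cost

for _ in range(20000):
    # forward pass
    H = sigmoid(X @ W1)              # hidden activations
    Y = sigmoid(H @ W2)              # network outputs
    # backward pass: deltas for the squared-error cost
    dY = (Y - T) * Y * (1 - Y)       # delta at the output layer
    dH = (dY @ W2.T) * H * (1 - H)   # delta backpropagated to the hidden layer
    # gradient-descent weight updates
    W2 -= eta * H.T @ dY
    W1 -= eta * X.T @ dH

E = 0.5 * np.sum((sigmoid(sigmoid(X @ W1) @ W2) - T) ** 2)  # final cost
```

Note that nothing in the procedure guarantees a global minimum: depending on the initial weights, the descent may settle in one of the local minima discussed below.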
The presence of local minima derives essentially from two different causes. First, they
may arise because of an unsuitable joint choice of the functions which define the network
dynamics and the error function. Second, local minima may be inherently related to the
structure of the problem at hand. In [5], these two cases have been referred to as spurious
and structural local minima, respectively. Problems of suboptimal solutions may also
occur when learning with high initial weights, since a sort of premature neuron saturation
arises, which is strictly related to the neuron fan-in. An interesting way of facing this
problem is to use the relative cross-entropy metric [10], for which the erroneous saturation
of the output neurons does not lead to plateaux, but to very high values of the