
Copyright © 1996 by Leonard G. C. Hamey. Appears in Proc. Seventh Australian Conf. Artificial Neural Networks, pages 179-183, 1996.
Analysis of the Error Surface of the XOR Network with Two
Hidden Nodes
Leonard G. C. Hamey
School of MPCE
Macquarie University NSW 2109 Australia
ABSTRACT
The exclusive-or learning task in a feedforward neural network with two hidden nodes is investigated. Constraint equations have been derived which fully describe the finite stationary points of the error surface. It is shown that the stationary points occur in a single connected union of eighteen manifolds. A Taylor series expansion is applied to the network error surface and it is shown that all points within the enumerated manifolds are arbitrarily close to points of lower error. It follows that the finite stationary points of the exclusive-or task are saddle points, not relative minima. This result is surprising in view of the commonly held belief that the exclusive-or task exhibits local minima. The present result complements a recent result which proves the absence of regional local minima in the exclusive-or task.
Fig. 1: Feedforward network to solve the exclusive-or task. [Diagram labels: u11, u12, u21, u22, v1, v2, q1, q2, f.]
1. Introduction
It is well known that backpropagation learning can become trapped when trained on the exclusive-or task with two hidden nodes (figure 1). However, such trapped networks, commonly called local minima, have been observed to be rare (Rumelhart, Hinton and Williams, 1986), and their occurrence depends upon the initial conditions and the network learning parameters (Kolen and Pollack, 1990). The present paper presents a theoretical analysis of the error surface of the exclusive-or task.
The study of the error surfaces of feedforward neural networks is hampered by high dimensionality and the difficulty of theoretical analysis. Although some results have been forthcoming, these are for restricted cases. Analyses exist for networks without hidden nodes (Brady, Raghavan and Slawny, 1989; Sontag and Sussmann, 1989; Sontag and Sussmann, 1991), networks composed of linear nodes (Baldi and Hornik, 1989) and networks with as many hidden nodes as training patterns (Poston, Lee, Choie and Kwon, 1991). In general, networks with fewer hidden nodes than training patterns appear not to be amenable to analysis. A significant exception is the exclusive-or network (figure 1).
Blum (1989) proved the existence of solutions in the exclusive-or learning task. Blum also attempted to prove the existence of a manifold of relative local minima in the error surface, but the proof was flawed, as previously shown by Hamey (1995c) and Sprinkhuizen-Kuyper and Boers (1994b). Lisboa and Perantonis (1991) characterise the stationary points of the error surface, obtaining four classes. Their classes (b)-(d) occur only as points with infinite weight values, but class (a) occurs for finite weight values. Hamey (1995a) proves that the exclusive-or task does not have any regional local minima. Other analyses of the exclusive-or network and related learning tasks may be found in (Gori and Tesi, 1992; Gori and Tesi, 1990; Sprinkhuizen-Kuyper and Boers, 1994a; Sprinkhuizen-Kuyper and Boers, 1994c).
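The notion of a finite stationary point of the error surface can be illustrated with a small numerical sketch. The code below is not taken from the paper; it assumes logistic (sigmoid) nodes, a sum-of-squares error with targets 0 and 1, an output bias (here called q_out), and a hypothetical ordering of the weights labelled in figure 1. Under these assumptions it evaluates the error and a finite-difference gradient at the all-zero weight vector, where the gradient can be checked to vanish.

```python
import math

# XOR training patterns as ((x1, x2), target). Targets of 0/1 are an assumption.
PATTERNS = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(a):
    """Logistic activation (assumed; this excerpt does not fix the node function)."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(w, x):
    """Output of a 2-2-1 network for input x.

    w = [u11, u12, q1, u21, u22, q2, v1, v2, q_out]; this ordering, the
    reading of q1 and q2 as hidden-node biases, and the output bias q_out
    are all illustrative assumptions.
    """
    u11, u12, q1, u21, u22, q2, v1, v2, q_out = w
    h1 = sigmoid(u11 * x[0] + u12 * x[1] + q1)
    h2 = sigmoid(u21 * x[0] + u22 * x[1] + q2)
    return sigmoid(v1 * h1 + v2 * h2 + q_out)


def error(w):
    """Sum-of-squares error over the four XOR patterns (assumed error measure)."""
    return 0.5 * sum((forward(w, x) - t) ** 2 for x, t in PATTERNS)


def numerical_gradient(w, step=1e-6):
    """Central-difference estimate of the gradient of error() at w."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        up[i] += step
        down = list(w)
        down[i] -= step
        grad.append((error(up) - error(down)) / (2.0 * step))
    return grad


origin = [0.0] * 9
print("error at origin:", error(origin))
print("largest gradient component:",
      max(abs(g) for g in numerical_gradient(origin)))
```

At the origin every node outputs 0.5, so the error is 0.5 and each gradient component is zero up to rounding. A vanishing gradient only establishes that the point is stationary; it does not by itself distinguish a saddle point from a relative minimum, which is the question addressed in this paper.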
The present paper extends the results of Hamey (1995a) by considering finite relative local minima. A point $w_0$ is said to be a relative minimum of a function $f(w)$ if there exists $\epsilon > 0$ such that $f(w_0 + \Delta w) \ge f(w_0)$ for all $|\Delta w| < \epsilon$. As discussed in Hamey (1995a), this definition is unsuitable for the consideration of minima that occur with infinite weights, hence the adoption of an alternate definition of local minimum in that paper. In the present paper, we examine in detail the manifolds of weight space that satisfy class (a) of Lisboa and Perantonis