
Copyright © 1996 by Leonard G. C. Hamey. Appears in Proc. Seventh Australian Conf. Artificial Neural Networks, pages 179-183, 1996.

Analysis of the Error Surface of the XOR Network with Two Hidden Nodes

Leonard G. C. Hamey

[email protected]

School of MPCE

Macquarie University NSW 2109 Australia

ABSTRACT

The exclusive-or learning task in a feed-forward neural network with two hidden nodes is investigated. Constraint equations have been derived which fully describe the finite stationary points of the error surface. It is shown that the stationary points occur in a single connected union of eighteen manifolds. A Taylor series expansion is applied to the network error surface and it is shown that all points within the enumerated manifolds are arbitrarily close to points of lower error. It follows that the finite stationary points of the exclusive-or task are saddle points, not relative minima. This result is surprising in view of the commonly held belief that the exclusive-or task exhibits local minima. The present result complements a recent result which proves the absence of regional local minima in the exclusive-or task.

Fig. 1: Feed-forward network to solve the exclusive-or task. (Diagram labels: u11, u12, u21, u22, q1, q2, v1, v2, f.)
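To make the object of study concrete, the following is a minimal sketch of a two-hidden-node network and its error over the four exclusive-or patterns. It is an illustration under stated assumptions, not the paper's own formulation: the sigmoid activation, the sum-of-squares error, the reading of the figure's labels (u as input-to-hidden weights, q as hidden biases, v as hidden-to-output weights) and the name `bias_out` for an output bias are all assumed here.

```python
import math

def sigmoid(x):
    """Logistic activation (assumed; the activation is not stated in this section)."""
    return 1.0 / (1.0 + math.exp(-x))

# The four exclusive-or training patterns: ((x1, x2), target).
XOR_PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def network_output(w, x1, x2):
    """Forward pass through a 2-2-1 network; weight roles are assumptions."""
    h1 = sigmoid(w["u11"] * x1 + w["u12"] * x2 + w["q1"])
    h2 = sigmoid(w["u21"] * x1 + w["u22"] * x2 + w["q2"])
    return sigmoid(w["v1"] * h1 + w["v2"] * h2 + w["bias_out"])

def xor_error(w):
    """Sum-of-squares error over the four patterns (assumed error measure)."""
    return sum((network_output(w, x1, x2) - t) ** 2
               for (x1, x2), t in XOR_PATTERNS)

# All-zero weights: every output is 0.5, so the error is 4 * 0.25.
zero_weights = {k: 0.0 for k in
                ("u11", "u12", "u21", "u22", "q1", "q2", "v1", "v2", "bias_out")}

# A hand-built solution: hidden node 1 acts as OR, hidden node 2 as AND,
# and the output fires when OR is on and AND is off.
solution = {"u11": 20.0, "u12": 20.0, "q1": -10.0,
            "u21": 20.0, "u22": 20.0, "q2": -30.0,
            "v1": 20.0, "v2": -20.0, "bias_out": -10.0}

print(xor_error(zero_weights), xor_error(solution))
```

Under these assumptions the error is a function of nine weights, which is the kind of error surface whose stationary points the paper analyses.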

1. Introduction

It is well known that back-propagation learning can become trapped when trained on the exclusive-or task with two hidden nodes (figure 1). However, trapped networks, which are commonly called local minima, have been observed to be rare (Rumelhart, Hinton and Williams, 1986), and their occurrence depends upon the initial conditions and the network learning parameters (Kolen and Pollack, 1990). This paper presents a theoretical analysis of the error surface of the exclusive-or task.

The study of the error surfaces of feed-forward neural networks is hampered by high dimensionality and the difficulty of theoretical analysis. Although some results have been forthcoming, these are for restricted cases. Analyses exist for networks without hidden nodes (Brady, Raghavan and Slawny, 1989; Sontag and Sussmann, 1989; Sontag and Sussmann, 1991), networks composed of linear nodes (Baldi and Hornik, 1989) and networks with as many hidden nodes as training patterns (Poston, Lee, Choie and Kwon, 1991). In general, networks with fewer hidden nodes than training patterns appear not to be amenable to analysis. A significant exception is the exclusive-or network (figure 1).

Blum (1989) proved the existence of solutions in the exclusive-or learning task. Blum also attempted to prove the existence of a manifold of relative local minima in the error surface, but the proof was flawed, as previously shown by Hamey (1995c) and Sprinkhuizen-Kuyper and Boers (1994b). Lisboa and Perantonis (1991) characterise the stationary points of the error surface, obtaining four classes. Their classes (b)-(d) occur only as points with infinite weight values, but class (a) occurs for finite weight values. Hamey (1995a) proves that the exclusive-or task does not have any regional local minima. Other analyses of the exclusive-or network and related learning tasks may be found in (Gori and Tesi, 1992; Gori and Tesi, 1990; Sprinkhuizen-Kuyper and Boers, 1994a; Sprinkhuizen-Kuyper and Boers, 1994c).
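The distinction between a stationary point that is a relative minimum and one that is a saddle point can be illustrated on toy functions. The sketch below is not taken from the paper: the functions `saddle` and `bowl` and the helper `lower_neighbour` are hypothetical names introduced only for illustration. Both toy functions have zero gradient at the origin, but only the saddle has points of lower value arbitrarily close to it. The probe looks along coordinate axes only, so finding a lower neighbour rules out a relative minimum, whereas finding none proves nothing in general.

```python
def lower_neighbour(f, w0, eps=1e-3):
    """Search within distance eps of w0, along each coordinate axis,
    for a point whose value is strictly below f(w0).
    Returns that point, or None if the axis probes find nothing lower."""
    base = f(w0)
    for i in range(len(w0)):
        for step in (eps / 2, -eps / 2):
            w = list(w0)
            w[i] += step
            if f(w) < base:
                return w
    return None

def saddle(w):
    """Stationary at the origin, but decreasing along the second axis."""
    return w[0] ** 2 - w[1] ** 2

def bowl(w):
    """Stationary at the origin, which is a true relative minimum."""
    return w[0] ** 2 + w[1] ** 2

origin = [0.0, 0.0]
print(lower_neighbour(saddle, origin))  # a nearby point with lower value
print(lower_neighbour(bowl, origin))    # no lower point found by the probes
```

Shrinking `eps` does not change the outcome for the saddle, which mirrors the sense in which a stationary point can be arbitrarily close to points of lower error.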

The present paper extends the results of Hamey (1995a) by considering finite relative local minima. A point $w_0$ is said to be a relative minimum of a function $f(w)$ if there exists $\epsilon > 0$ such that $f(w_0 + \Delta w) \ge f(w_0)$ for all $|\Delta w| < \epsilon$. As discussed in Hamey (1995a), this definition is unsuitable for the consideration of minima that occur with infinite weights, hence the adoption of an alternative definition of local minimum in that paper. In the present paper, we examine in detail the manifolds of weight space that satisfy class (a) of Lisboa and Perantonis