
In D.N.L. Levy and D.F. Beal, editors, Heuristic Programming in Artificial Intelligence 2: The Second Computer Olympiad. Ellis Horwood, 1991.

Exploratory Learning

in the Game of GO

Barney Pell¹

University of Cambridge

Cambridge UK

E-mail: [email protected]

Abstract

This paper considers the importance of exploration to game-playing programs that learn by playing against opponents. The central question is whether a learning program should play the move that offers the best chance of winning the present game, or the move that has the best chance of providing useful information for future games. An approach to this question is developed using probability theory and then implemented in two different learning methods. Initial experiments in the game of Go suggest that a program that takes exploration into account can learn better against a knowledgeable opponent than a program that does not.

1 Introduction

One of the earliest aspirations of Artificial Intelligence was to develop game-playing programs that could improve their play through experience, adapt their strategy to compete against a variety of opponents, and ultimately outplay their programmers. As in most learning problems, a program

¹ Parts of this work have been supported by RIACS, NASA Ames Research Center [FIA], and a British Marshall Scholarship.