Applications of neural networks in Computer Go
Abstract
A conventional brute-force game program defines a board evaluation function that heuristically measures the merit of each board position. It finds the best computer response by exhaustively searching a game tree whose edges are the moves and whose nodes are the resulting board positions. The East Asian board game Go does not succumb to this brute-force approach, because of the high branching factor of its search tree and a computationally intractable evaluation function. This dissertation explores reinforcement learning for implementing a computer program that plays Go. Reinforcement learning has been highly successful in learning evaluation functions for a few games, such as checkers, Othello and backgammon. Two different reinforcement learning approaches to computer Go have been examined in this research.
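The brute-force paradigm described above can be sketched as a plain minimax search. This is a minimal illustration, not code from the dissertation; `legal_moves`, `apply_move` and `evaluate` are hypothetical helpers supplied by the caller.

```python
def minimax(position, depth, maximizing, legal_moves, apply_move, evaluate):
    """Exhaustively search the game tree to the given depth and return
    the heuristic value of the best reachable position."""
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)  # heuristic board evaluation at the frontier
    values = [minimax(apply_move(position, m), depth - 1, not maximizing,
                      legal_moves, apply_move, evaluate) for m in moves]
    return max(values) if maximizing else min(values)
```

For Go this search is infeasible: with hundreds of legal moves per position the tree grows far too quickly, and no cheap, accurate `evaluate` is known, which is the motivation for the learning approaches below.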
The first approach combines the temporal difference method with a mixture-of-experts neural network architecture. The networks learned reasonable Go evaluation functions for the 9x9 board and were able to defeat a public-domain, rule-based Go program.
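The core of temporal difference learning can be sketched with a TD(0) value update. This is a hedged, tabular illustration of the general method, not the dissertation's implementation, which trained neural networks rather than a lookup table; the names `V`, `alpha` and `gamma` are the usual value table, learning rate and discount factor.

```python
def td0_update(V, state, next_state, reward, alpha=0.1, gamma=1.0):
    """Move the value estimate V(state) toward the bootstrapped target
    reward + gamma * V(next_state)."""
    v, v_next = V.get(state, 0.0), V.get(next_state, 0.0)
    V[state] = v + alpha * (reward + gamma * v_next - v)
```

Applied to a game, each successive board position supplies the next state, and the final win/loss result supplies the terminal reward, so evaluation errors are corrected backward through the game without any hand-labelled positions.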
The second approach features a hybrid, modular architecture that incorporates search techniques alongside machine learning algorithms, and was applied to 19x19 Go. After training, the network evaluation function correctly scored 69 of 100 professional test games to within ±5 points. The network's selective move generator ranked the professional move within its top 20 candidates more than 63% of the time on 10,000 test positions. A prognosis for future research on Go is also discussed. This hybrid system of neural networks and artificial-intelligence search, demonstrated on 19x19 Go, is suitable for attacking complex search and decision problems such as gene sequencing.
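The top-20 screening result above corresponds to a simple hit-rate metric: the fraction of test positions in which the professional's actual move appears among the network's k highest-ranked candidates. A minimal sketch of that metric, with hypothetical names, follows; the dissertation's own evaluation code is not shown here.

```python
def top_k_hit_rate(ranked_moves, pro_moves, k=20):
    """Fraction of positions where the professional move appears in the
    network's top-k ranked candidate moves.

    ranked_moves: one list of candidate moves per position, best first.
    pro_moves:    the professional's actual move for each position.
    """
    hits = sum(pro in ranking[:k]
               for ranking, pro in zip(ranked_moves, pro_moves))
    return hits / len(pro_moves)
```

A selective move generator with a high top-k hit rate lets the search examine only a small, promising subset of the hundreds of legal moves, which is what makes combining learned move generation with tree search tractable on the full 19x19 board.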