UCT algorithm to beat go master
Mogo and the UCT algorithm beat a human opponent at master level in the game of Go for the first time on March 22, 2008 in a tournament organized by the French Federation of Go.
The game of go
It is a game of
Chinese origin which is played on a board named goban of 19 x 19 cases, but
that is often reduced to 13 x 13 or 9 x 9 cases. The players have white or black
stones they pose at the vacant intersections of lines of the board, alternately,
in order to encircle the stones of the opponent to score points.
In this historical duel against the computer, the goban used was 9 x 9.
The game of Go resembles to Othello that be played on a 8 x 8 cases board. In Othello stones surrounded take color of stones which surround them and therefore change their owner.
Othello software with the computer as opponent exist for a long time and to replace the human player, they use simple algorithms such as alpha-beta.
The Go game has nothing to do with Gomoku which is derived from the FiveInLine game. This a 3x3 boxes and the aim is to create a diagonal or line or three pieces of the same colour.
Mogo and the UCT algorithm
In 1998, the Deeper Blue program from IBM has fought for the first time a chess master, Gary Kasparov. However, the game of Go is more difficult to program because the number of moves to predict is virtually unlimited.
The best Go program is currently Mogo, developed by INRIA, CNRS and Paris-Sud University. This program uses the UCT algorithm to browse the tree of possible moves.
UCT, Upper bound for Confidence Tree is derived from UCB, Upper
Confidence Bounds and includes the use of Monte Carlo as evaluation
function.
The tree is crawled from the root, that is the current position of stones
and is driven by the UCT algorithm until a terminal node is reached.
At this point an evaluation function is applied that is named Monte-Carlo. The UCT algorithm travels several times the same branches, but they are
evaluated every time and therefore it will tend to go the most promising
branches.
The algo has been improved by applying the Monte-Carlo simulation which consist to fill randomly the board
with stones and count the points for each player, to give a value to this
terminal node.
The name Monte-Carlo refers to casinos in the principality and thus to chance, which is the basis of the method
used.
Reference: UCT and Monte Carlo. (PDF)