Yesterday, we reported that an artificial Go player had defeated one of the top human players for the first time, in a best-of-five match.
From this analysis of the game, it seems that (at least) two things were at play here (Hat tip PB).
The first is called ‘Manipulation’, a technique used to connect otherwise unrelated parts of the board. My understanding is that you set up two (or more!) separate positions on the board: one that is bad for you unless you gain an extra move, and another that might let you gain that extra move. Since the two locations are separate, a player needs a very specific sense of non-locality to play this correctly.
To me, this feels like an excellent example of why Go is so difficult to solve computationally, and why there is still much fertile ground here for research.
The second seems to be an instance of what is called the ‘Horizon Effect’. Simply put, if you only search the tree of possible plays to a certain depth, any consequences below that depth are invisible to you. So a move that looks good in the short term but has terrible consequences further down the road can slip past a typical tree search entirely. In this particular case, the supposition is that Sedol’s brilliant move 78 should have triggered a ‘crap, that was a brilliant move, I need to deal with that’ response, instead of a ‘now all the moves I was thinking of are bad moves, except for this subtree, which seems to be okay as far out as I can see’. The fact that AlphaGo only realized at move 87 that something was very wrong supports this hypothesis.
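The mechanism is easy to see in miniature. Below is a toy, single-agent sketch (the tree, node names, heuristic values, and payoffs are all invented for illustration): a depth-limited search that falls back to a static heuristic at its horizon will happily walk into a line whose disaster sits just below the cutoff.

```python
# A toy game tree, invented for illustration. Internal nodes map to
# lists of children; leaves map to true payoffs for the searcher.
TREE = {
    "root": ["safe", "trap"],
    "safe": ["s1"],
    "s1": 1,                    # modest but honest payoff
    "trap": ["t1"],
    "t1": ["t2"],
    "t2": -100,                 # the disaster, two plies below "trap"
}

# A static heuristic used at the search horizon; it likes the trap line.
HEURISTIC = {"safe": 1, "s1": 1, "trap": 5, "t1": 5}

def value(node, depth):
    """Depth-limited search: below the cutoff, fall back to the static
    heuristic, so any consequence deeper than `depth` is invisible."""
    children = TREE[node]
    if isinstance(children, (int, float)):   # true leaf: real payoff
        return children
    if depth == 0:                           # the horizon
        return HEURISTIC.get(node, 0)
    return max(value(c, depth - 1) for c in children)

def best_move(depth):
    return max(TREE["root"], key=lambda m: value(m, depth - 1))

print(best_move(2))   # "trap" -- the -100 is beyond the horizon
print(best_move(4))   # "safe" -- deep enough to see the consequence
```

The deeper search doesn’t make the trap line any better; it just finally sees what was always there, which is roughly what seems to have happened to AlphaGo between moves 79 and 87.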
Is the Horizon Effect something you can just throw more machine learning at? Isn’t this what humans do?
By a ‘sense of non-locality’, I mean specifically the idea that two things can be related only by the fact that you can use resources from one to help the other.
One wonders what type of ‘Quiescence Search’ AlphaGo was using, that it missed this.
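For readers unfamiliar with the term: quiescence search is the standard patch for the horizon effect. Instead of trusting the static evaluation at the cutoff, the search keeps extending along ‘forcing’ moves (captures, threats) until the position is quiet. Here is a minimal sketch with an invented toy position model — the names, scores, and ‘forcing’ lists are all assumptions for illustration, not anything AlphaGo actually does:

```python
# Toy position model, invented for illustration: each position has a
# static score (from our perspective), a flag for whether it is
# tactically quiet, and its forcing continuations.
POSITIONS = {
    "p":     {"score": 8,   "quiet": False, "forcing": ["p_cap"]},
    "p_cap": {"score": -20, "quiet": True,  "forcing": []},
}

def quiesce(name, our_move):
    """At the horizon, don't trust the static score of an unstable
    position: extend along forcing moves until things go quiet. The
    side to move may also 'stand pat' on the static score."""
    pos = POSITIONS[name]
    if pos["quiet"] or not pos["forcing"]:
        return pos["score"]
    replies = [quiesce(c, not our_move) for c in pos["forcing"]]
    best_reply = max(replies) if our_move else min(replies)
    # standing pat: the side to move can decline the forcing line
    if our_move:
        return max(pos["score"], best_reply)
    return min(pos["score"], best_reply)

# Position "p" is reached at the horizon with the opponent to move.
print(quiesce("p", our_move=False))   # -20: the forcing capture is seen
print(POSITIONS["p"]["score"])        #   8: what a bare cutoff would return
```

The hypothesis in this post is, in effect, that whatever analogue of this AlphaGo uses treated the position after move 78 as quiet when it was anything but.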