Decision forests for computer Go feature learning

Van Niekerk, Francois (2014-04)

Thesis (MSc)--Stellenbosch University, 2014.

Thesis

ENGLISH ABSTRACT: In computer Go, moves are typically selected with the aid of a tree search algorithm. Monte-Carlo tree search (MCTS) is currently the dominant algorithm in computer Go. It has been shown that the inclusion of domain knowledge in MCTS is able to vastly improve the strength of MCTS engines. A successful approach to representing domain knowledge in computer Go is the use of appropriately weighted tactical features and pattern features, which are comprised of a number of hand-crafted heuristics and a collection of patterns respectively. However, tactical features are hand-crafted specifically for Go, and pattern features are Go-specific, making it unclear how they can be easily transferred to other domains. As such, this work proposes a new approach to representing domain knowledge, decision tree features. These features evaluate a state-action pair by descending a decision tree, with queries recursively partitioning the state-action pair input space, and returning a weight corresponding to the partition element represented by the resultant leaf node. In this work, decision tree features are applied to computer Go, in order to determine their feasibility in comparison to state-of-the-art use of tactical and pattern features. In this application of decision tree features, each query in the decision tree descent path refines information about the board position surrounding a candidate move. The results of this work showed that a feature instance with decision tree features is a feasible alternative to the state-of-the-art use of tactical and pattern features in computer Go, in terms of move prediction and playing strength, even though computer Go is a relatively well-developed research area. A move prediction rate of 35.9% was achieved with tactical and decision tree features, and they showed comparable performance to the state of the art when integrated into an MCTS engine with progressive widening. We conclude that the decision tree feature approach shows potential as a method for automatically extracting domain knowledge in new domains. These features can be used to evaluate state-action pairs for guiding searchbased techniques, such as MCTS, or for action-prediction tasks.

AFRIKAANSE OPSOMMING: In rekenaar Go, word skuiwe gewoonlik geselekteer met behulp van ’n boomsoektogalgoritme. Monte-Carlo boomsoektog (MCTS) is tans die dominante algoritme in rekenaar Go. Dit is bekend dat die insluiting van gebiedskennis in MCTS in staat is om die krag van MCTS enjins aansienlik te verbeter. ’n Suksesvolle benadering tot die voorstelling van gebiedskennis in rekenaar Go is taktiek- en patroonkenmerke met geskikte gewigte. Hierdie behels ’n aantal handgemaakte heuristieke en ’n versameling van patrone onderskeidelik. Omdat taktiekkenmerke spesifiek vir Go met die hand gemaak is, en dat patroonkenmerke Go-spesifiek is, is dit nie duidelik hoe hulle maklik oorgedra kan word na ander velde toe nie. Hierdie werk stel dus ’n nuwe verteenwoordiging van gebiedskennis voor, naamlik besluitboomkenmerke. Hierdie kenmerke evalueer ’n toestand-aksie paar deur rekursief die toevoerruimte van toestand-aksie pare te verdeel deur middel van die keuses in die besluitboom, en dan die gewig terug te keer wat ooreenstem met die verdelingselement wat die ooreenstemmende blaarnodus verteenwoordig. In hierdie werk, is besluitboomkenmerke geëvalueer op rekenaar Go, om hul lewensvatbaarheid in vergelyking met veldleidende gebruik van taktiek- en patroonkenmerke te bepaal. In hierdie toepassing van besluitboomkenmerke, verfyn elke navraag in die pad na onder van die besluitboom inligting oor die posisie rondom ’n kandidaatskuif. Die resultate van hierdie werk het getoon dat ’n kenmerkentiteit met besluitboomkenmerke ’n haalbare alternatief is vir die veldleidende gebruik van taktiek- en patroonkenmerke in rekenaar Go in terme van skuifvoorspelling as ook speelkrag, ondanks die feit dat rekenaar Go ’n relatief goedontwikkelde navorsingsgebied is. ’n Skuifvoorspellingskoers van 35.9% is behaal met taktiek- en besluitboomkenmerke, en hulle het vergelykbaar met veldleidende tegnieke presteer wanneer hulle in ’n MCTS enjin met progressiewe uitbreiding geïntegreer is. Ons lei af dat ons voorgestelde besluitboomkenmerke potensiaal toon as ’n metode vir die outomaties onttrek van gebiedskennis in nuwe velde. Hierdie eienskappe kan gebruik word om toestand-aksie pare te evalueer vir die leiding van soektog-gebaseerde tegnieke, soos MCTS, of vir aksie-voorspelling.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/86280
This item appears in the following collections: