Developing basic soccer skills using reinforcement learning for the RoboCup small size league

Yoon, Moonyoung (2015-03)

Thesis (MSc)--Stellenbosch University, 2015.

Thesis

ENGLISH ABSTRACT: This study has started as part of a research project at Stellenbosch University (SU) that aims at building a team of soccer-playing robots for the RoboCup Small Size League (SSL). In the RoboCup SSL the Decision- Making Module (DMM) plays an important role for it makes all decisions for the robots in the team. This research focuses on the development of some parts of the DMM for the team at SU. A literature study showed that the DMM is typically developed in a hierarchical structure where basic soccer skills form the fundamental building blocks and high-level team behaviours are implemented using these basic soccer skills. The literature study also revealed that strategies in the DMM are usually developed using a hand-coded approach in the RoboCup SSL domain, i.e., a specific and fixed strategy is coded, while in other leagues a Machine Learning (ML) approach, Reinforcement Learning (RL) in particular, is widely used. This led to the following research objective of this thesis, namely to develop basic soccer skills using RL for the RoboCup Small Size League. A second objective of this research is to develop a simulation environment to facilitate the development of the DMM. A high-level simulator was developed and validated as a result. The temporal-difference value iteration algorithm with state-value functions was used for RL, along with a Multi-Layer Perceptron (MLP) as a function approximator. Two types of important soccer skills, namely shooting skills and passing skills were developed using the RL and MLP combination. Nine experiments were conducted to develop and evaluate these skills in various playing situations. The results showed that the learning was very effective, as the learning agent executed the shooting and passing tasks satisfactorily, and further refinement is thus possible. In conclusion, RL combined with MLP was successfully applied in this research to develop two important basic soccer skills for robots in the RoboCup SSL. These form a solid foundation for the development of a complete DMM along with the simulation environment established in this research.

AFRIKAANSE OPSOMMING: Hierdie studie het ontstaan as deel van 'n navorsingsprojek by Stellenbosch Universiteit wat daarop gemik was om 'n span sokkerrobotte vir die RoboCup Small Size League (SSL) te ontwikkel. Die besluitnemingsmodule (BM) speel 'n belangrike rol in die RoboCup SSL, aangesien dit besluite vir die robotte in die span maak. Hierdie navorsing fokus op ontwikkeling van enkele komponente van die BM vir die span by SU. 'n Literatuurstudie het getoon dat die BM tipies ontwikkel word volgens 'n hiërargiese struktuur waarin basiese sokkervaardighede die fundamentele boublokke vorm en hoëvlak spangedrag word dan gerealiseer deur hierdie basiese vaardighede te gebruik. Die literatuur het ook getoon dat strategieë in die BM van die RoboCup SSL domein gewoonlik ontwikkel word deur 'n hand-gekodeerde benadering, dit wil s^e, 'n baie spesifieke en vaste strategie word gekodeer, terwyl masjienleer (ML) en versterkingsleer (VL) wyd in ander ligas gebruik word. Dit het gelei tot die navorsingsdoelwit in hierdie tesis, naamlik om basiese sokkervaardighede vir robotte in die RoboCup SSL te ontwikkel. 'n Tweede doelwit was om 'n simulasie-omgewing te ontwikkel wat weer die ontwikkeling van die BM sou fasiliteer. Hierdie simulator is suksesvol ontwikkel en gevalideer. Die tydwaarde-verskil iterariewe algoritme met toestandwaarde-funksies is gebruik vir VL saam met 'n multi-laag perseptron (MLP) vir funksiebenaderings. Twee belangrike sokkervaardighede, naamlik doelskop- en aangeevaardighede is met hierdie kombinasie van VL en MLP ontwikkel. Nege eksperimente is uitgevoer om hierdie vaardighede in verskillende speelsituasies te ontwikkel en te evalueer. Volgens die resultate was die leerproses baie effektief, aangesien die leer-agent die doelskiet- en aangeetake bevredigend uitgevoer het, en verdere verfyning is dus moontlik. Die gevolgtrekking is dat VL gekombineer met MLP suksesvol toegepas is in hierdie navorsingswerk om twee belangrike, basiese sokkervaardighede vir robotte in die RoboCup SSL te ontwikkel. Dit vorm 'n sterk fondament vir die ontwikkeling van 'n volledige BM tesame met die simulasie-omgewing wat in hierdie werk daargestel is.

Please refer to this item in SUNScholar by using the following persistent URL: http://hdl.handle.net/10019.1/96823
This item appears in the following collections: