Accelerating deep reinforcement learning for autonomous racing

Date
2023-03
Publisher
Stellenbosch : Stellenbosch University
Abstract
ENGLISH ABSTRACT: The F1/10th racing problem is to use the onboard LiDAR scan to calculate speed and steering references that move a 1/10th scale car around the track as quickly as possible. While solutions have typically used perception, planning and control pipelines, deep reinforcement learning (DRL) has recently grown in popularity because it does not require an explicit state representation and is flexible across environments. Current approaches suffer from poor performance at low speeds and from safety concerns exacerbated by sim-to-real transfer, and few consider obstacle avoidance.
The first contribution of this work is the development of high-speed learning formulations for autonomous racing. A comprehensive evaluation of previous approaches concludes that current learning formulations train agents to select infeasible speed profiles, leaving the agents unable to race using the vehicle's full speed range. This problem is overcome by using analytical vehicle models to develop learning formulations for improved speed selection. The performance evaluation shows that the novel formulations enable the agent to learn a feasible speed profile using the vehicle's full speed range and to achieve lower lap times than previous methods in the literature. This result indicates that using vehicle models improves high-performance racing behaviour.
The second contribution of this work is to enable online learning by using a supervisory safety system (SSS). A safety system is designed that uses viability theory to ensure vehicle safety, irrespective of the planner used. The SSS is incorporated into the learning formulation and used to train DRL agents to race without them ever crashing. The novel learning formulation is extensively evaluated in simulation, demonstrating that online training can train agents to race without ever crashing, that it achieves a 10× improvement in sample efficiency, and that the trained agents select conservative speed profiles. The proposed method is validated at constant speed on a physical vehicle, demonstrating that an agent can be trained from a random initialisation to drive around a track without ever crashing.
The final contribution of this work is to explore how DRL agents can be used to expand the ability of current classical planners to avoid unmapped obstacles. Three hybrid architectures that combine classical and learning components are presented and evaluated. The modification planner, which combines a path follower and a DRL agent in parallel, demonstrates the ability to track a reference path while avoiding unmapped obstacles. The results indicate that combining classical and DRL components can improve the performance of DRL agents while enabling classical solutions to avoid obstacles.
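As an illustration of the supervisory idea in the second contribution, the following is a minimal sketch and not the thesis implementation. It assumes a toy two-variable state, a simplified kinematic update (`step`), and a placeholder one-step safety check (`is_viable`) standing in for a viability-based safe set; all names, parameters and thresholds here are hypothetical. The supervisor passes the planner's steering action through when the predicted next state is safe and otherwise substitutes the closest candidate action that is.

```python
import math
from typing import Callable, Sequence, Tuple

# Hypothetical toy state: (lateral offset from centre line [m], heading error [rad]).
State = Tuple[float, float]


def step(state: State, steering: float, speed: float = 1.0, dt: float = 0.1) -> State:
    """Simplified kinematic update (an assumption, not a real vehicle model)."""
    offset, heading = state
    heading_next = heading + steering * dt
    offset_next = offset + speed * math.sin(heading_next) * dt
    return (offset_next, heading_next)


def is_viable(state: State, half_width: float = 1.0, max_heading: float = 1.0) -> bool:
    """Placeholder safety check standing in for a lookup in a precomputed safe set."""
    offset, heading = state
    return abs(offset) < half_width and abs(heading) < max_heading


def supervise(
    state: State,
    planner_action: float,
    candidates: Sequence[float],
    dynamics: Callable[[State, float], State] = step,
    viable: Callable[[State], bool] = is_viable,
) -> Tuple[float, bool]:
    """Return (action to execute, whether the supervisor intervened)."""
    # Accept the planner's action if the predicted next state is safe.
    if viable(dynamics(state, planner_action)):
        return planner_action, False
    # Otherwise try candidate actions, closest to the planner's choice first.
    for action in sorted(candidates, key=lambda a: abs(a - planner_action)):
        if viable(dynamics(state, action)):
            return action, True
    raise RuntimeError("no candidate action keeps the toy state inside the safe set")


if __name__ == "__main__":
    steering_options = [-0.4, -0.2, 0.0, 0.2, 0.4]
    # The planner can be anything (for example an untrained agent); only its output is checked.
    print(supervise((0.0, 0.0), 0.1, steering_options))
    print(supervise((0.95, 0.5), 0.4, steering_options))
```

Because the supervisor only inspects the proposed action, the same wrapper could in principle sit in front of any planner, which is the property the abstract attributes to the SSS. In a learning setting, the intervention flag could additionally be used as a training signal, though how the thesis does this is not stated here.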
AFRIKAANS SUMMARY: No summary available.
Description
Thesis (PhD)--Stellenbosch University, 2023.