Cooperation and communication in multiagent deep reinforcement learning
Hausknecht, Matthew John
Reinforcement learning is the area of machine learning concerned with learning which actions to execute in an unknown environment in order to maximize cumulative reward. As agents begin to perform tasks of genuine interest to humans, they will be faced with environments too complex for humans to predetermine the correct actions using hand-designed solutions. Instead, capable learning agents will be necessary to tackle complex real-world domains. However, traditional reinforcement learning algorithms have difficulty with domains featuring 1) high-dimensional continuous state spaces, for example pixels from a camera image, 2) high-dimensional parameterized-continuous action spaces, 3) partial observability, and 4) multiple independent learning agents. We hypothesize that deep neural networks hold the key to scaling reinforcement learning towards complex tasks. This thesis seeks to answer the following two-part question: 1) How can the power of Deep Neural Networks be leveraged to extend Reinforcement Learning to complex environments featuring partial observability, high-dimensional parameterized-continuous state and action spaces, and sparse rewards? 2) How can multiple Deep Reinforcement Learning agents learn to cooperate in a multiagent setting? To address the first part of this question, this thesis explores the idea of using recurrent neural networks to combat partial observability experienced by agents in the domain of Atari 2600 video games. Next, we design a deep reinforcement learning agent capable of discovering effective policies for the parameterized-continuous action space found in the Half Field Offense simulated soccer domain. To address the second part of this question, this thesis investigates architectures and algorithms suited for cooperative multiagent learning. We demonstrate that sharing parameters and memories between deep reinforcement learning agents fosters policy similarity, which can result in cooperative behavior. 
Additionally, we hypothesize that communication can further aid cooperation, and we present the Grounded Semantic Network (GSN), which learns a communication protocol grounded in the observation space and reward function of the task. In general, we find that the GSN is effective on domains featuring partial observability and asymmetric information. All in all, this thesis demonstrates that reinforcement learning combined with deep neural network function approximation can produce algorithms capable of discovering effective policies for domains with partial observability, parameterized-continuous action spaces, and sparse rewards. Additionally, we demonstrate that single agent deep reinforcement learning algorithms can be naturally extended towards cooperative multiagent tasks featuring learned communication. These results represent a non-trivial step towards extending agent-based AI towards complex environments.
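The abstract's claim that sharing parameters between agents fosters policy similarity can be illustrated with a small sketch. This is not the thesis's architecture: it assumes a toy linear action-value function in place of a deep network, and the names `SharedLinearPolicy`, `Agent`, and the observation values are hypothetical, chosen only for illustration. The point it shows is that when two agents hold a reference to the same weights, an update driven by one agent's experience immediately changes the other agent's value estimates, so both agents choose the same action for the same observation.

```python
import random


class SharedLinearPolicy:
    """Toy linear action-value function (a stand-in for a deep network)."""

    def __init__(self, n_features, n_actions, seed=0):
        rng = random.Random(seed)
        # One weight row per action; small random initial values.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
            for _ in range(n_actions)
        ]

    def q_values(self, obs):
        """Return one estimated value per action for the observation."""
        return [sum(w * x for w, x in zip(row, obs)) for row in self.weights]

    def act(self, obs):
        """Greedy action: index of the largest estimated value."""
        q = self.q_values(obs)
        return max(range(len(q)), key=q.__getitem__)

    def update(self, obs, action, target, lr=0.1):
        """One gradient step moving Q(obs, action) toward the target."""
        error = target - self.q_values(obs)[action]
        row = self.weights[action]
        for i, x in enumerate(obs):
            row[i] += lr * error * x


class Agent:
    """Hypothetical agent that only holds a reference to a policy."""

    def __init__(self, name, policy):
        self.name = name
        self.policy = policy


# Both agents reference the same parameter object (assumption: full sharing).
shared = SharedLinearPolicy(n_features=3, n_actions=2)
agent_a = Agent("a", shared)
agent_b = Agent("b", shared)

obs = [1.0, 0.0, 0.5]
before = agent_b.policy.q_values(obs)[1]
# Agent a learns from its own experience...
agent_a.policy.update(obs, action=1, target=1.0)
# ...and agent b's estimate for the same observation moves as well.
after = agent_b.policy.q_values(obs)[1]
```

In this sketch the sharing is total, so the two policies are identical by construction. Sharing only part of a network, or sharing memories rather than weights, would be other points in the same design space.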