Collaborative Museum Heist with Reinforcement Learning
Overview
In this paper, we present our initial findings from applying reinforcement learning techniques to a museum heist game. A team of trained robbers with different skills learns to cooperate, maximizing individual and team rewards while avoiding detection by scripted security guards and cameras; these results showcase the feasibility of a full game in which both sides are trained concurrently in an adversarial setting.
Abstract
Non-playable characters (NPCs) play a crucial role in enhancing immersion in video games. However, traditional NPC behaviors are often hard-coded using methods such as Finite State Machines and Decision and Behavior Trees. This approach has two main limitations: it is difficult to implement complex cooperative behaviors, and it makes it easy for human players to identify and exploit behavioral patterns. To overcome these challenges, reinforcement learning (RL) can be used to generate dynamic, real-time NPC responses to human player actions. In this paper, we report first results of applying RL techniques to a non-zero-sum, asymmetric adversarial game using a multi-agent team. The game environment simulates a museum heist in which a team of robbers with different skills (Locksmith, Technician) must steal valuable items from the museum without being detected by scripted security guards and cameras. The two robber agents were trained concurrently with separate policies and received both individual and group reward signals. Through this training process, the agents learned to cooperate effectively and to use their skills to maximize both individual and team benefits. These results demonstrate the feasibility of realizing the full game, where both robbers and security guards are trained at the same time to achieve their adversarial goals.
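To make the reward structure concrete, the sketch below illustrates one way to combine per-agent (individual) and shared (group) reward signals for two concurrently trained policies. This is a minimal, hypothetical Python example: the agent names, reward terms, and weights are our own assumptions for illustration and do not reproduce the paper's actual reward design.

```python
# Illustrative sketch (not the authors' code): combining individual and
# group reward signals for two concurrently trained robber policies.
# Agent names, reward terms, and magnitudes are assumptions for illustration.

from dataclasses import dataclass
from typing import Dict


@dataclass
class StepOutcome:
    picked_up_valuable: bool   # individual success (e.g., Locksmith opens a display case)
    assisted_teammate: bool    # cooperative action (e.g., Technician disables a camera)
    detected: bool             # caught by a guard or security camera


def individual_reward(o: StepOutcome) -> float:
    # Reward signal specific to one agent's own actions.
    r = 0.0
    if o.picked_up_valuable:
        r += 1.0
    if o.detected:
        r -= 1.0
    return r


def group_reward(outcomes: Dict[str, StepOutcome]) -> float:
    # Shared signal: the team gains if any member progresses and
    # loses if any member is detected.
    r = 0.0
    if any(o.picked_up_valuable or o.assisted_teammate for o in outcomes.values()):
        r += 0.5
    if any(o.detected for o in outcomes.values()):
        r -= 0.5
    return r


def rewards(outcomes: Dict[str, StepOutcome]) -> Dict[str, float]:
    # Each agent receives its own individual reward plus the shared group reward.
    g = group_reward(outcomes)
    return {agent: individual_reward(o) + g for agent, o in outcomes.items()}


if __name__ == "__main__":
    step = {
        "locksmith": StepOutcome(picked_up_valuable=True, assisted_teammate=False, detected=False),
        "technician": StepOutcome(picked_up_valuable=False, assisted_teammate=True, detected=False),
    }
    print(rewards(step))  # each separate policy is updated with its own combined signal
```

Under this kind of scheme, each policy is updated only with its own combined signal, which lets agents like the Locksmith and Technician specialize while still being encouraged to cooperate.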
Game Environment & Sensors
Ray Sensors. Starting from top left: back (pink), front (red), security camera (blue), robbers & guards (green), and valuables (yellow).
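The ray sensors in the figure suggest observations built from what each ray hits. The following sketch, assuming a simple one-hot-per-ray encoding, shows how such hits could be flattened into a fixed-size observation vector; the tag names, ray count, and encoding are illustrative assumptions rather than the project's actual sensor implementation.

```python
# Illustrative sketch (assumptions, not the project's implementation): encoding
# what each ray hits into a fixed-size observation vector. Tag names follow the
# figure (cameras, robbers & guards, valuables); ray count and encoding are hypothetical.

from typing import List, Optional, Tuple

DETECTABLE_TAGS = ["wall", "security_camera", "agent", "valuable"]


def encode_ray_hit(tag: Optional[str], normalized_distance: float) -> List[float]:
    # One-hot over detectable tags, plus a "missed" flag and the hit distance.
    one_hot = [1.0 if tag == t else 0.0 for t in DETECTABLE_TAGS]
    missed = 1.0 if tag is None else 0.0
    return one_hot + [missed, normalized_distance]


def ray_observation(hits: List[Tuple[Optional[str], float]]) -> List[float]:
    # hits: one (tag or None, normalized distance in [0, 1]) entry per ray.
    obs: List[float] = []
    for tag, dist in hits:
        obs.extend(encode_ray_hit(tag, dist))
    return obs


if __name__ == "__main__":
    # e.g., 3 rays: one sees a valuable nearby, one a camera far away, one misses.
    hits = [("valuable", 0.2), ("security_camera", 0.9), (None, 1.0)]
    print(ray_observation(hits))  # flat vector fed to the agent's policy network
```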
Video
BibTeX
@article{Evripidou:2023,
author = {Evripidou, Eleni and Aristidou, Andreas and Charalambous, Panayiotis},
title = {Collaborative museum heist with reinforcement learning},
journal = {Computer Animation and Virtual Worlds},
issue_date = {May 2023},
volume = {34},
number = {3-4},
month = {may},
pages = {e2158},
doi = {10.1002/cav.2158},
publisher = {Wiley},
year = {2023}
}
Acknowledgments
This project has received funding from the European Union's Horizon 2020 Research and Innovation Programme under Grant Agreement 739578 and the Government of the Republic of Cyprus through the Deputy Ministry of Research, Innovation and Digital Policy; and internal funds from the University of Cyprus (project: Demonstration).