This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the research sprint on October 1, 2023

The artificial wolves of Millers Hollow

This research explored the behavior of GPT-3.5 and GPT-4 language model (LM) agents playing the game The Werewolves of Millers Hollow. By analyzing games with a minimal setup of 2 werewolves and 3 villagers, the study aimed to understand the agents' collaborative and deceptive capabilities. GPT-3.5 werewolves performed significantly above a random baseline, indicating coordinated voting strategies and persuasion. Preliminary observations with GPT-4 revealed even more complex strategies, though a comprehensive review was constrained by time and budget. The study suggests that this game can be a valuable environment for further assessing LM agent behavior in intricate social simulations.
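To make the idea of a random baseline concrete, here is a minimal Python sketch that estimates how often werewolves win when every player acts uniformly at random in a 2-werewolf, 3-villager game. The rules used here are assumptions for illustration, not the study's protocol: the game starts with a day vote, ties are broken at random, werewolves remove one random villager each night, and werewolves win once they are at least as numerous as the villagers. The function names are hypothetical.

```python
import random
from collections import Counter


def play_random_game(rng, n_wolves=2, n_villagers=3):
    """Play one game with uniformly random choices; return the winning side."""
    roles = ["wolf"] * n_wolves + ["villager"] * n_villagers
    alive = list(range(len(roles)))

    def winner():
        wolves = sum(roles[i] == "wolf" for i in alive)
        villagers = len(alive) - wolves
        if wolves == 0:
            return "villagers"
        if wolves >= villagers:  # assumed parity win condition
            return "werewolves"
        return None

    while True:
        # Day (assumed rule): each living player votes for a random other
        # living player; the most-voted player is eliminated, ties at random.
        votes = Counter(rng.choice([j for j in alive if j != i]) for i in alive)
        top = max(votes.values())
        alive.remove(rng.choice([p for p, c in votes.items() if c == top]))
        if winner():
            return winner()

        # Night (assumed rule): werewolves remove one random living villager.
        villagers_alive = [i for i in alive if roles[i] == "villager"]
        alive.remove(rng.choice(villagers_alive))
        if winner():
            return winner()


def werewolf_win_rate(n_games=20000, seed=0):
    """Estimate the werewolves' win rate under fully random play."""
    rng = random.Random(seed)
    wins = sum(play_random_game(rng) == "werewolves" for _ in range(n_games))
    return wins / n_games


if __name__ == "__main__":
    print(f"Random-play werewolf win rate: {werewolf_win_rate():.3f}")
```

Under these assumed rules each living player is equally likely to be voted out, so the villagers win only if a werewolf is eliminated on both days: (2/5)(1/3) = 2/15, leaving the werewolves a 13/15 chance. That figure is a property of this sketch, not a number from the study, whose rules and baseline may differ.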

Dana Léo, Feuillade-Montixi Quentin, Tavernier Florent