This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the research sprint on October 2, 2023

LLM Collectives in Multi-Round Interactions: Truth or Deception?

Drawing inspiration from prior research on LLMs in multi-agent settings such as debate and social deduction games, we set up a simulation in which Large Language Models (LLMs) collaboratively assess potential security breaches in an organization. Each agent navigates a mixture of evidence, ranging from crucial to misleading. While individual agents access distinct subsets of the information, the experiment's design promotes inter-agent communication and debate. The primary objective is to evaluate whether, through structured interactions, LLMs can converge on accurate conclusions. We are particularly interested in the system's robustness to modified evidence and in the influence of deceptive agents. These challenges are especially pressing given numerous recent examples of deception in frontier AI systems (Park et al., 2023). The outcomes could shed light on the intricacies and vulnerabilities of collaborative AI decision-making.

Paolo Bova, Matthew J. Lutz, Mahan Tourkaman, Anushka Deshpande, Thorbjørn Wolf