This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the research sprint on August 21, 2023

SADDER - Situational Awareness Dataset for Detecting Extreme Risks

We create a benchmark for detecting two types of situational awareness that we believe are important for assessing threats from advanced AI systems: the ability to tell training from testing, and the ability of a model to reason about how it can and cannot influence the world. We measure the performance of several LLMs on this benchmark (GPT-4, Claude, and several GPT-3.5 variants).

Rudolf Laine, Alex Meinke