This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the Safety Benchmarks research sprint on July 4, 2023

From Sparse to Dense: Refining the MACHIAVELLI Benchmark for Real-World AI Safety

In this paper, we extend the MACHIAVELLI benchmark by making it sensitive to event density, which improves its ability to distinguish between models with different value systems. In particular, the extension helps identify potentially malicious actors, who are prone to committing harmful actions in rapid succession, and separate them from well-intentioned actors.

By Heramb Podar, Vladislav Bargatin
