This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the Evals research sprint on August 21, 2023

Can Large Language Models Solve Security Challenges?

This study examines the growing capabilities of AI systems, especially Large Language Models (LLMs), in operating computer systems and writing code. Current LLMs cannot yet fully replicate themselves in an uncontrolled way, but there is concern that future models may acquire this "blackbox escape" ability. The research presents an evaluation method in which LLMs must solve cybersecurity challenges that require interacting with a computer and bypassing security measures. Models that consistently overcome these challenges are likely to pose a risk of blackbox escape. Among the models tested, GPT-4 performs best on the simpler challenges, and more capable models tend to solve challenges more consistently and in fewer steps. The paper suggests including automated security challenge solving in comprehensive assessments of model capabilities.

By Andrey Anurin, Ziyue Wang