Apart Sprints

Join our monthly hackathons and collaborate with brilliant minds worldwide on impactful AI safety research. Explore and sign up for upcoming hackathons here.

Deception Detection Hackathon: Preventing AI deception

Jun 28 to Jul 1, 2024

Join us to develop methods to detect and measure how deceptive current AI models are. We'll hear from top researchers in the field and collaborate to build original research projects during the weekend.

This event concluded on Jul 1, 2024.

Deception Detection Hackathon: Preventing AI deception

Independently organized SprintX · 🚩 Virtual & Local · Jun 28 · Canceled


Welcome to the Flagging AI Risks Sprint Season. From March to June 2024, Apart is hosting four research hackathons focused on catastrophic risk evaluations of AI. See the hackathons above and stay updated by signing up!

In-Person & Online

Join events on the Discord or at our in-person locations around the world! Follow the calendar here.

Live Mentorship Q&A

Our expert team will be available on the hackathon Discord to help with any questions about methods and theory.

For Everyone

You can join in the middle of the Sprint if you can't make the start, and we provide code starters, ideas, and inspiration; see an example.

Awards & Next Steps

We will help you take the next steps in your research journey with the Apart Lab Fellowship, which provides mentorship, help with publication, funding, and more.
Winning hackathon projects

rAInboltBench: Benchmarking user location inference through single images

Le "Qronox" Lam, Aleksandr Popov, Jord Nguyen, Trung Dung "mogu" Hoang, Marcel M, Felix Michalak · May 31, 2024

Beyond Refusal: Scrubbing Hazards from Open-Source Models

Kyle Gabriel Reynoso, Ivan Enclonar, Lexley Maree Villasis · May 8, 2024

EscalAtion: Assessing Multi-Agent Risks in Military Contexts

Gabriel Mukobi*, Anka Reuel*, Juan-Pablo Rivera*, Chandler Smith* · October 2, 2023

Investigating Neuron Behaviour via Dataset Example Pruning and Local Search

Alex Foote · November 15, 2022

We Discovered An Neuron

Joseph Miller, Clement Neo · January 25, 2023

Agreeableness vs. Truthfulness

October 18, 2022

Automated Sandwiching: Efficient Self-Evaluations of Conversation-Based Scalable Oversight Techniques

Sophia Pung, Gabriel Mukobi · February 16, 2023

Data Taxation

Joshua Sammet, Per Ivar Friborg, William Wale · July 21, 2023

Discovering Latent Knowledge in Language Models Without Supervision - extensions and testing

Agatha Duzan, Matthieu David, Jonathan Claybrough · December 19, 2022

Detecting Implicit Gaming through Retrospective Evaluation Sets

Jacob Haimes, Lucie Philippon, Alice Rigg, Cenny Wenner · November 27, 2023

Evaluating Myopia in Large Language Models

Marco Bazzani, Felix Binder · September 10, 2023

Exploring the Robustness of Model-Graded Evaluations of Language Models

Simon Lermen, Ondřej Kvapil · July 2, 2023

In the Mirror: Using Chess to Simulate Agency Loss in Feedback Loops

Helios Lyons · September 24, 2023

Model Cards for AI Algorithm Governance

Jaime Raldua Veuthey, Gediminas Dauderis, Chetan Talele · January 7, 2024

Apart Sprints Overview

The Apart Sprints are short hackathons and challenges focused on the most important questions in AI security. We collaborate with aligned organizations and labs across the globe.

2023's Sprints in numbers

174 projects · 1,036 participants · 50+ sprint locations

🤗 Hack away with the best vibes

🥳 Hear from previous participants

"
This Hackathon was a perfect blend of learning, testing, and collaboration on cutting-edge AI Safety research. I really feel that I gained practical knowledge that cannot be learned only by reading articles.
"
Yoann Poupart
BlockLoads CTO
"
It was an amazing experience working with people I didn't even know before the hackathon. All three of my teammates were extremely spread out, while I am from India, my teammates were from New York and Taiwan. It was amazing how we pulled this off in 48 hours in spite of the time difference. Moreover, the mentors were extremely encouraging and supportive which helped us gain clarity whenever we got stuck and helped us create an interesting project in the end.
"
Akash Kundu
Apart Lab Fellow
"
It was great meeting such cool people to work with over the weekend! I did not know any of the other people in my group at first, and now I'm looking forward to working with them again on research projects! The organizers were also super helpful and contributed a lot to the success of our project.
"
Lucie Philippon
France Pacific Territories Economic Committee
"
The Interpretability Hackathon exceeded my expectations, it was incredibly well organized with an intelligently curated list of very helpful resources. I had a lot of fun participating and genuinely feel I was able to learn significantly more than I would have, had I spent my time elsewhere. I highly recommend these events to anyone who is interested in this sort of work!
"
Chris Mathwin
MATS scholar

⚡️ Recent research sprints

🤯 Published hackathon projects

✨ More than 200 research teams

AI Safety unionization for bottom-up governance
AI Safety Subproblems for Software Engineering Researchers
AI Safety Talent Pool Identification
Analysis of upcoming AGI companies
Authority bias to ChatGPT
ChatGPT Alignment Talent Search
Catalogue of AI safety
Critique of OpenAI's alignment plan
Diversity in AI safety
New AI organization brainstorm
Risk Defense Initiative
Simon's Time-Off Newsletter
Evaluating Myopia in Large Language Models
Marco Bazzani, Felix Binder
In the Mirror: Using Chess to Simulate Agency Loss in Feedback Loops
Helios Lyons
Preserving Agency in Reinforcement Learning under Unknown, Evolving and Under-Represented Intentions
Harry Powell, Luigi Berducci
Against Agency
Catherine Brewer
Agency as Shannon information. Unveiling limitations and common misconceptions
Ivan Madan, Hennadii Madan
Discovering Agency Features as Latent Space Directions in LLMs via SVD
max max
Comparing truthful reporting, intent alignment, agency preservation and value identification
Aksinya Bykova
Uncertainty about value naturally leads to empowerment
Filip Sondej
Agency, value and empowerment.
Benjamin Sturgeon, Leo Hyams
ILLUSION OF CONTROL
Mary Osuka