Artificial intelligence will change the world. Our mission is to ensure this happens safely and to the benefit of everyone.
Finding Deception in Language Models
This June, Apart Research and Apollo Research joined forces to host the Deception Detection Hackathon, bringing together students, researchers, and engineers from around the world to tackle one of the most pressing challenges in AI safety: preventing AI from deceiving humans.
Interpreting Context Look-ups in Transformers: Investigating Attention-MLP Interactions
We probe how attention heads activate specialized "next-token" neurons in LLMs. Prompting GPT-4 reveals heads recognizing contexts tied to predicting tokens, activating neurons via residual connections. This elucidates context-dependent specialization in LLMs.
Agent Security Hackathon: Expanding our Knowledge
A weekend to research how AI safety techniques change in the face of arbitrarily complex agent architectures!
We aim to produce foundational research enabling the safe and beneficial development of advanced AI.
Luke Marks*, Amir Abdullah*, Luna Mendez, Rauno Arike, Philip Torr, Fazl Barez
Alex Foote*, Neel Nanda, Esben Kran, Ioannis Konstas, Shay Cohen, Fazl Barez*
Albert Garde*, Esben Kran*, Fazl Barez
Michael Lan, Fazl Barez
Clement Neo*, Shay B. Cohen, Fazl Barez*
Jason Hoelscher-Obermaier, Julia Persson, Esben Kran, Ioannis Konstas, Fazl Barez
Welcome to our Accelerating AI Safety Sprint Season! Here, we'll focus on building concrete solutions and tools to solve key questions in AI safety. Join our exciting hackathons to develop new research in deep tech AI safety, agent security, research tooling, and policy.
The Flagging AI Risks Sprint Season ran from March to June 2024 and comprised four research hackathons focused on catastrophic risk evaluations of AI. A cohort of researchers has been invited to continue the research they began during these events.
Our initiatives allow people from diverse backgrounds to have an impact on AI safety. Across seven continents, more than 250 projects have been developed by over 1,200 researchers. Some teams have gone on to publish their research at major academic venues, such as NeurIPS, ICML, and ACL.
We engage in both original research and contracting for research projects that aim to translate academic insights into actionable strategies for mitigating catastrophic risks associated with AI. We have co-authored with researchers from the University of Oxford, DeepMind, Anthropic, and more.
Our aim is to foster a positive vision and an action-focused approach to AI safety. Informed by over 10 years of experimenting with research processes, our projects are driven by fundamental innovation in how research is done, and solving large problems has become our specialization.
Check out the list below for ways you can interact or research with Apart!
If you have shovel-ready lists of AI safety and AI governance ideas lying around, submit them to aisafetyideas.com and we'll add them to the list as we make each one even more shovel-ready!
You can work directly with us on aisafetyideas.com, on Discord, or on Trello. If you have some specific questions, write to us here.
Send your feature ideas our way in the #features-bugs channel on Discord. We appreciate any and all feedback!
You can book a meeting here and we can talk about anything between the clouds and the dirt. We're looking forward to meeting you.
The website is designed so that ideas are validated by experts. If you would like to be one of these experts, write to us here. It can be a huge help for the community!
The blog contains the public outreach for A*PART. Sign up for the mailing list below to get future updates.
Associate Kranc
Head of Research Department
Commanding Center Management Executive
Partner Associate Juhasz
Head of Global Research
Commanding Cross-Cultural Research Executive
Associate Soha
Commanding Research Executive
Manager of Experimental Design
Partner Associate Lækra
Head of Climate Research Associations
Research Equality- and Diversity Manager
Partner Associate Hvithammar
Honorary Fellow of Data Science and AI
P0rM Deep Fake Expert
Partner Associate Waade
Head of Free Energy Principle Modelling
London Subsidiary Manager
Partner Associate Dankvid
Partner Snus Executive
Bodily Contamination Manager
Partner Associate Nips
Head of Graphics Department
Cake Coding Expert
Associate Professor Formula T.
Honorary Associate Fellow of Research Ethics and Linguistics
Optimal Science Prediction Analyst
Partner Associate A.L.T.
Commander of the Internally Restricted CINeMa Research
Keeper of Secrets and Manager of the Internal REC
Follow the latest from the Apart Community and stay updated on our research and events.