Our mission is to ensure AI systems are safe and beneficial
Our Lab publishes empirical work focused on state-of-the-art solutions for AI safety.
Research
We are a decentralized organization designed to solve the biggest problems in AI risk.
Lab
Apart provides opportunities for thousands of builders to pilot AI safety research.
Sprints
With partners and co-authors from world-class partners
Apart News: Agents, Submissions & Spain
Apart News is our newsletter to keep you up-to-date.
Interpreting Learned Feedback Patterns in Large Language Models
We hypothesize that LLMs with Learned Feedback Patterns accurately aligned to the fine-tuning feedback exhibit consistent activation patterns for outputs that would have received similar feedback during RLHF.
We produce foundational research enabling the safe and beneficial development of advanced AI.
Alex Foote*, Neel Nanda, Esben Kran, Ioannis Konstas, Shay Cohen, Fazl Barez*
Albert Garde*, Esben Kran*, Fazl Barez
Michael Lan, Fazl Barez
Clement Neo*, Shay B. Cohen, Fazl Barez*
Luke Marks∗†, Amir Abdullah∗†♢, Clement Neo†, Rauno Arike†, David Krueger□, Philip Torr‡, Fazl Barez∗‡
†Apart Research, ♢Cynch.ai, □University of Cambridge, ‡Department of Engineering Sciences, University of Oxford
Join experts and fellow researchers as we build AI safety together.
Check out the list below for ways you can interact or research with Apart!
If you have lists of shovel-ready AI safety and AI governance ideas lying around, submit them to aisafetyideas.com and we'll add them to the list as we make each one more shovel-ready!
You can work directly with us on aisafetyideas.com, on Discord, or on Trello. If you have specific questions, write to us here.
Send your feature ideas our way in the #features-bugs channel on Discord. We appreciate any and all feedback!
You can book a meeting here and we can talk about anything between the clouds and the dirt. We're looking forward to meeting you.
We have a design where ideas are validated by experts on the website. If you would like to be one of these experts, write to us here. It can be a huge help for the community!
The blog contains the public outreach for A*PART. Sign up for the mailing list below to get future updates.
Associate Kranc
Head of Research Department
Commanding Center Management Executive
Partner Associate Juhasz
Head of Global Research
Commanding Cross-Cultural Research Executive
Associate Soha
Commanding Research Executive
Manager of Experimental Design
Partner Associate Lækra
Head of Climate Research Associations
Research Equality- and Diversity Manager
Partner Associate Hvithammar
Honorary Fellow of Data Science and AI
P0rM Deep Fake Expert
Partner Associate Waade
Head of Free Energy Principle Modelling
London Subsidiary Manager
Partner Associate Dankvid
Partner Snus Executive
Bodily Contamination Manager
Partner Associate Nips
Head of Graphics Department
Cake Coding Expert
Associate Professor Formula T.
Honorary Associate Fellow of Research Ethics and Linguistics
Optimal Science Prediction Analyst
Partner Associate A.L.T.
Commander of the Internally Restricted CINeMa Research
Keeper of Secrets and Manager of the Internal REC
Follow the weekly updates from the Apart community and stay up to date on the latest news, research, and events.