Join us for an exciting hackathon where we'll leverage Computational Mechanics to understand and control neural network behavior. This new AI Safety and interpretability approach is built upon a rigorous mathematical framework from physics and enables precise predictions about the internal geometry of AI systems, overcoming limitations of current interpretability methods.
We have $1,500 in prizes for the best projects: $750 for 1st place, $500 for 2nd place, and $250 for 3rd place!
Collaborate with researchers and enthusiasts to stress-test the approach, create benchmarks, and develop a science of AI cognitive capabilities and risks. Contribute to the advancement of AI safety and help build a safer, more controllable AI future.
Sign up now to contribute to this new approach to AI Interpretability and Safety, and for a chance to win cash prizes!
Society is building AI systems with ever-increasing capabilities that interact more and more with the world at large. To make sure our future is a healthy and safe one, we need to understand the principles and mechanisms by which these intelligent systems operate. In this hackathon we will use Computational Mechanics, a framework from physics that studies the computational structure of prediction, to do interpretability research on neural networks.
Our initial work has found that Computational Mechanics can make precise and unintuitive predictions about the internal geometry of activations in transformers. This finding has opened up many avenues of inquiry, and there is plenty of low-hanging fruit to pick! Participation in this hackathon requires little background, and can take the form of computational experiments, theory/math, or coding/engineering. In the resources section you can find getting started guides, an open problems document, and a number of coding examples to get you started!
You will participate in teams of 1-5 people and submit a project on the entry submission page. Each project consists of three parts: 1) a PDF report, 2) a video overview of at most 10 minutes, 3) a title, summary, and descriptions.
You are allowed to think about your project and engage with the starter resources before the hackathon starts, but your core research work should happen during the hackathon itself.
Besides these two points, this hackathon is mainly an opportunity to meaningfully engage in Computational Mechanics, Interpretability Research, and AI Safety.
Time is presented in Pacific Daylight Time (California). 10:30 PDT is 17:30 UTC.
Friday, May 31, 10:30 AM, PDT: Keynote talk with Adam Shai and Paul Riechers to inspire your projects and provide an introduction to the topic. Following the talk, we will chat on the Discord server to discuss teams.
Saturday and Sunday, June 1 and 2, 10:30 AM, PDT: Project discussion sessions on the Discord server.
Sunday, June 2, at 17:30 PDT: Online ending session
Wednesday, June 5, at 10:30 PDT: Project presentations
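If you are joining from outside California, the short Python sketch below converts the schedule above from PDT to UTC. It relies on two assumptions that are not stated in this announcement: the year is taken to be 2024 (the year in which May 31 falls on a Friday), and the helper name `to_utc` and the session labels are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# PDT is UTC-7; a fixed offset avoids needing a time zone database.
PDT = timezone(timedelta(hours=-7), name="PDT")

# ASSUMPTION: the announcement does not state the year. 2024 is used here
# because May 31 falls on a Friday that year, matching the schedule.
ASSUMED_YEAR = 2024

# (label, month, day, hour, minute) in PDT, copied from the schedule.
# The labels are shorthand for illustration only.
SCHEDULE = [
    ("Keynote talk", 5, 31, 10, 30),
    ("Project discussion (Saturday)", 6, 1, 10, 30),
    ("Project discussion (Sunday)", 6, 2, 10, 30),
    ("Online ending session", 6, 2, 17, 30),
    ("Project presentations", 6, 5, 10, 30),
]


def to_utc(month, day, hour, minute, year=ASSUMED_YEAR):
    """Interpret the given wall-clock time as PDT and return it in UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=PDT)
    return local.astimezone(timezone.utc)


if __name__ == "__main__":
    for label, *when in SCHEDULE:
        print(f"{label}: {to_utc(*when):%a %b %d, %H:%M} UTC")
```

Note that the Sunday ending session at 17:30 PDT lands after midnight in UTC, so it falls on Monday, June 3 at 00:30 UTC.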
Submit your entry on this page!
See submissions here.