This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.
Accepted at the AI Testing research sprint on December 19, 2022

Formal Verification for Paren-Balance Checking

Artificial intelligence is rapidly developing new capabilities and can now interpret the context of grammatical structures and sentences correctly. However, reliably balancing parentheses remains a relevant, unsolved problem in NLP, and solving it would enable more human-like capabilities. Our team approached this problem by devising a reward function based on the positional encoding of parentheses. A dataset was generated using a smaller GPT-2 model, and an NLP model was trained on it in PyTorch, reaching an average accuracy of 93%. To demonstrate robustness, we attempted formal verification with auto-LiRPA. Such models could be applied more broadly to processes such as causal scrubbing, a method for rigorously testing interpretability hypotheses. We therefore believe that strong formal verification tools for paren-balance checking could be highly impactful for future AI alignment research.
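For reference, the task the model is trained to approximate can be stated exactly as a stack-based check. The sketch below is illustrative and not part of the submission's codebase; the function name `is_balanced` is our own:

```python
def is_balanced(s: str) -> bool:
    """Return True if every bracket in s is closed in the correct order."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in s:
        if ch in "([{":
            # Remember each opener until its matching closer arrives.
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recent unclosed opener.
            if not stack or stack.pop() != pairs[ch]:
                return False
    # Balanced only if no openers remain unclosed.
    return not stack
```

A learned model approximates this decision function from examples, which is why formal verification is needed to establish that its behavior matches the exact check on all inputs within a given perturbation set.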

Pablo Villalobos, Edu Infante Roldán, Nguyen Tran, Heramb Podar, Alejandro Acelas