| 
            
              
                | 
                    Euan Ong
                   I've just finished my undergraduate degree in Computer Science at the University of Cambridge (where I ranked 1st / ~120 every year). I'm currently working on adversarial robustness at Anthropic for a year, before starting a Ph.D. in 'abstractions-first' mechanistic interpretability at MIT with Jacob Andreas and Armando Solar-Lezama.
                   
                    My ambition is to develop powerful, yet safe and interpretable
                    abstract reasoners, whose
                    internal
                    state and behaviour remain transparent to the end
                      user.
                   
                    To this end, I'm particularly interested in exploring how the mathematical toolkits we use to
                    understand and structure programs ‒ such as formal methods,
                    types and category theory ‒ can
                    inspire new ways
                    to both reverse-engineer existing neural networks, and build scalable neurosymbolic systems.
                   
                    Email  / 
                    CV  / 
                    
                    Google Scholar
                     / 
                    Twitter  / 
                    LinkedIn  / 
                    Github
                   |   |  
            
              
                | Research
                    So far, my research has broadly focused on studying the behaviour of neural networks in
                      vitro:
                    understanding both how they generalise when learning to perform abstract tasks, and
                    what
                    this tells us about the algorithms they've learned in order to do so.
                   
                    Previously, I've probed the foundations of neural
                      algorithmic
                      reasoning, explored attacks on vision-language models, and poked language model
                    representations
                    with a stick.
                   |  
            
              
                | Published work |  
                |   | Probing the Foundations of Neural Algorithmic Reasoning Euan Ong
 Technical Report, 2023; ICML Differentiable Almost Everything (Spotlight), 2024
 abstract
                  /
                  full text
                  /
                  project page
 
                    
                    I explored a fundamental claim of neural algorithmic reasoning, and found evidence to refute it through statistically robust ablations. Based on my observations, I developed a way to parallelise differentiable algorithms that preserves their efficiency and correctness guarantees while alleviating their performance bottlenecks. This work formed part of my Bachelor's thesis, which won the CS department's Best Dissertation Award. |  
                |   | Successor Heads: Recurring, Interpretable Attention Heads In The Wild Rhys Gould, 
                  Euan Ong, 
                  George Ogden,
                  Arthur Conmy
 ICLR, 2024;
                  NeurIPS ATTRIB (Oral), 2023
 arXiv
                  /
                  reviews
                  /
                  project page
                  /
                  tweeprint
 We discovered successor heads: attention heads present in a range of LLMs that increment tokens from ordinal sequences (e.g. numbers, months and days). We isolated a common numeric subspace within embedding space, that for any given token (e.g. 'February'), encodes the index of that token within its ordinal sequence (e.g. months). We also found that numeric token representations can be decomposed into interpretable features representing the value of the token mod 10, which can be used to edit the numeric value of the representation via vector arithmetic.
                   |  
                |   | Image Hijacks: Adversarial Images can Control Generative Models at
                      Runtime Luke Bailey*, Euan Ong*, Stuart Russell, Scott Emmons (* denotes equal contribution)
 ICML, 2024
 arXiv
                  /
                  project page + demo
                  /
                  tweeprint
 We discovered that adversarial images can hijack the behaviour of vision-language models (VLMs) at
                    runtime. We developed a general method for crafting these image hijacks,
                    and trained image hijacks forcing VLMs to output arbitrary text, leak their context window and
                    comply with harmful instructions. We also derived an algorithm to train hijacks forcing 
                    VLMs to behave as though they were given an arbitrary prompt, which we used to make them believe the Eiffel Tower is in Rome.
                   |  
                |   | Learnable Commutative Monoids for Graph Neural Networks Euan Ong, Petar Veličković
 Learning on Graphs Conference, 2022
 arXiv
                  /
                  reviews
                  /
                  project page
                  /
                  tweeprint
 Using ideas from abstract algebra and functional programming, we built a new GNN aggregator that
                    beats the state of the art on complex aggregation problems (especially out-of-distribution), while
                    remaining efficient and parallelisable on large graphs. |  
                | Other projects |  
                |   | Personality Machine Euan Ong*, 
                  Jamie Chen*,
                  Kyra Zhou*,
                  Marcus Handley*,
                  Mingle Chen*,
                  Ori Vasilescu*,
                  Eleanor Drage
                  (* denotes equal contribution)
 CST Group Project, 2022
 
 In collaboration with the Centre for Gender Studies at Cambridge, we built a tool highlighting the questionable logic behind the use of AI-driven personality assessments often used in hiring. Our tool demonstrates how arbitrary changes in facial expression, clothing, lighting and background can give radically different personality readings, and was featured in the BBC and the Telegraph.
                   |  
                |   | Dissecting Deep Learning for Systematic Generalisation Euan Ong, Etaash
                    Katiyar, Kai-En Chong, Albert
                    Qiaochu Jiang
 Informal research, 2021
 
 We investigated the capabilities of transformers to systematically generalise when learning to
                    recognise formal languages (such as
                    Parity and 2-Dyck), empirically corroborating various theoretical claims about transformer generalisation.
                    Inspired by our observations, we
                    derived a parallel, stackless algorithm for recognising 2-Dyck that could (in principle) be
                    implemented by a transformer with a constant number of attention layers.
                   |  
                |   | Object Detection in Thermal Imagery via Convolutional Neural
                      Networks Euan Ong, 
                  Niki Trigoni, 
                  Pedro Porto Barque de Gusmão
 Technical report, 2019
 
 
                    We trained a Faster R-CNN object detection network to identify landmarks (e.g. doors and windows) in
                    thermal images of indoor environments, with applications in the development of navigational aids for
                    search and rescue operations.
                   |  |