About me

I’m a recent Computer Science graduate from the University of Cambridge.

I want to understand how cognitive abilities form in deep neural networks. Currently, I’m exploring how the mechanistic¹ aspects of model interpretability can aid these efforts. I also appreciate the awesome, research-driven engineering² behind highly performant deep learning systems!

Please don’t hesitate to reach out or connect! I’m very happy to hear about your research, tell you a bit about mine, or just chat about anything interesting!

Research updates

  • 2025-09-23: Activation Transport Operators will be presented as a Spotlight at the Mechanistic Interpretability Workshop at NeurIPS 2025 🎉
  • 2025-08-24: Our recent work “Activation Transport Operators”, on transporting SAE features across transformer layers, is available! [arXiv | code]

More about me

Throughout my year-long MPhil degree, I worked alongside the CaMLSys group, where I focused on Federated Learning. My dissertation explored how several institutions with limited computational resources can collaborate on training a joint foundation language model. During my studies, I also benchmarked the inner workings of torch.compile() and explored KV-caching strategies for LLM inference. On the more theoretical front, I looked into the phenomenon of attention sinks³ in transformers and studied the concept of dynamic tokenisation. What a year it was!

Previously, I worked with researchers from UCL NLP on BritLLM – a joint effort towards producing freely available Large⁴ Language Models for UK languages⁵. In my undergraduate dissertation, I focused on the poor availability of LLMs for low-resource languages and worked on language model adaptation methods for African languages. We also explored Data-Efficient Task Unlearning in LMs, a method for improving the safety of language models by removing their undesired capabilities.

Notes

  1. mechanistic? 

  2. Awesome Engineering! 

  3. Write-up in progress. Update: It’s arrived! 

  4. As of the time of writing (2024-08-27), 3 billion parameters are enough for a model to be considered large. Update: As of 2025-07-28, a 3B model is still pretty big.

  5. Isn’t this just… English?! Explore our work to see what other languages are spoken in the UK!