About me
I’m a recent Computer Science graduate from the University of Cambridge.
I want to understand how cognitive abilities form in deep neural networks. Currently, I’m exploring how the mechanistic¹ aspects of model interpretability can aid these efforts. I also appreciate the awesome, research-driven engineering² behind highly performant deep learning systems!
Please don’t hesitate to reach out or connect! I’m very happy to hear about your research, tell you a bit about mine, or just chat about anything interesting!
Research updates
- 2025-09-23: Activation Transport Operators will be presented as a Spotlight at the Mechanistic Interpretability Workshop at NeurIPS 2025 🎉
- 2025-08-24: Our recent work “Activation Transport Operators” on transporting SAE features across transformer layers is available! [arxiv | code]
More about me
Throughout my year-long MPhil degree, I worked alongside the CaMLSys group, where I focused on Federated Learning. My dissertation explored how several institutions with limited computational resources can collaborate on training a joint foundation language model. During my studies, I also benchmarked the inner workings of torch.compile() and explored KV-caching strategies in LLM inference. On the more theoretical front, I looked into the phenomenon of attention sinks³ in transformers and studied the concept of dynamic tokenisation. What a year it was!
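(For the curious: here is a minimal sketch of what benchmarking torch.compile() can look like. The toy model and timing loop are illustrative only, not the setup from my actual experiments.)

```python
import time
import torch

# A tiny stand-in model; real benchmarks would use a full LLM.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 1024),
)
compiled = torch.compile(model)  # traces and JIT-compiles on the first call

x = torch.randn(64, 1024)

for label, fn in [("eager", model), ("compiled", compiled)]:
    fn(x)  # warm-up; triggers compilation for the compiled variant
    start = time.perf_counter()
    for _ in range(100):
        fn(x)
    print(f"{label}: {time.perf_counter() - start:.3f} s")
```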
Previously, I worked with researchers from UCL NLP on BritLLM – a joint effort towards producing freely available Large⁴ Language Models for UK languages⁵. In my undergraduate dissertation, I focused on the poor availability of LLMs for low-resource languages and worked on language model adaptation methods for African languages. Furthermore, we explored Data-Efficient Task Unlearning in LMs, a method for improving the safety of language models by removing their undesired capabilities.
Notes
¹ Write-up in progress. Update: It’s arrived!
⁴ As of the time of writing (2024-08-27), 3 billion parameters were enough for a model to be considered large. Update: As of 2025-07-28, a 3B model is still pretty big.
⁵ Isn’t this just… English?! Explore our work to see what other languages are spoken in the UK!