The Foundations of AI Seminar Series is dedicated to topics in artificial intelligence and machine learning, both empirical and theoretical, as well as related areas. Our goal is for these meetings to serve as a forum for discussion and rapid dissemination of results. We invite anyone interested in the latest advancements in AI/ML to join us!

Next Seminar


Resurrecting Recurrent Neural Networks

Speaker: Razvan Pascanu (Google DeepMind, UK)
Date: 26-11-2024, 2pm-3pm (BST)
Location: Computer Science building, CS1.04, University of Warwick, Coventry, UK

Abstract

In this talk I will focus on State Space Models (SSMs), a subclass of Recurrent Neural Networks (RNNs) that has recently gained attention through works like Mamba, obtaining strong performance against transformer baselines. I will start by explaining how SSMs can be viewed as a particular parametrization of RNNs and what the crucial differences are compared to previous recurrent architectures that led to these results. My goal is to demystify the relatively complex parametrization of the architecture and identify which elements are needed for the model to perform well. In this process I will introduce the Linear Recurrent Unit (LRU), a simplified linear layer inspired by existing SSM layers. In the second part of the talk, I will focus on language modelling and the block structure in which such layers tend to be embedded. I will argue that beyond the recurrent layer itself, the block structure borrowed from transformers plays a crucial role in the recent successes of this architecture, and present results at scale for well-performing hybrid recurrent architectures compared to strong transformer baselines. If time allows, I will expand the discussion to video, presenting a few ways SSMs can have an impact in video modeling. I will close the talk with a few open questions and thoughts on the importance of recurrence in modern deep learning models.


About Razvan Pascanu

Razvan Pascanu is currently a Research Scientist at DeepMind. He obtained a PhD in Computer Science from the University of Montreal (2014), supervised by Prof. Yoshua Bengio, and an MSc from Jacobs University Bremen (2009) under Prof. Herbert Jaeger. He has contributed to the development of Theano and authored deep learning tutorials for it, publishing several impactful works in deep learning and reinforcement learning. His research interests span optimization and learning, focusing on efficient and scalable optimization methods, data-efficient learning in reinforcement learning, and understanding learning dynamics. He is also deeply interested in memory mechanisms in RNNs, multi-task learning paradigms like continual learning and meta-learning, graph neural networks, and theoretical aspects of deep networks. Beyond research, he has actively contributed to the machine learning community as an organizer for EEML and AIRomania, where he has led Romanian AI Days since 2020 and developed AI courses for high school students. Additionally, he has been involved in organizing conferences such as the Lifelong Learning Conference (lifelong-ml.cc) and the Log Conference on Graph Neural Networks (logconference.org).

Upcoming Events


Resurrecting Recurrent Neural Networks

Razvan Pascanu - Research Scientist, Google DeepMind, UK
Date: Nov 26, 2024 at 12:00PM
Location: Computer Science building, CS1.04, University of Warwick, Coventry, UK

TBD

Patrick Rebeschini - Professor of Statistics and Machine Learning, University of Oxford, UK
Date: Jan 28, 2025 at 2:00PM
Location: Department of Computer Science, CS1.04, University of Warwick, Coventry, UK

TBD

Dr. Krikamol Muandet - Chief Scientist and Tenure-track Faculty (fast track) at CISPA
Date: Feb 11, 2025 at 2:00PM
Location: Department of Computer Science, CS1.01, University of Warwick, Coventry, UK

TBD

Dr. Cuong V. Nguyen - Assistant Professor, Durham University
Date: Feb 25, 2025 at 2:00PM
Location: Department of Computer Science, CS1.01, University of Warwick, Coventry, UK

TBD

Ilja Kuzborskij - Research Scientist, Google DeepMind
Date: Mar 04, 2025 at 2:00PM
Location: Department of Computer Science, CS1.04, University of Warwick, Coventry, UK

TBD

Yingzhen Li - Senior Lecturer, Imperial College, UK
Date: Mar 25, 2025 at 2:00PM
Location: Department of Computer Science, CS1.01, University of Warwick, Coventry, UK

TBD

Marco Mondelli - Assistant Professor, Institute of Science and Technology, Austria
Date: Jun 03, 2025 at 2:00PM
Location: Mathematical Science Building, MB0.07, University of Warwick, Coventry, UK

Organising Team

Fanghui Liu

Assistant Professor, CS Department, University of Warwick

Paris Giampouras

Assistant Professor, CS Department, University of Warwick

Long Tran-Thanh

Professor, CS Department, University of Warwick