Learning Dynamics of Overparametrized Neural Networks

René Vidal

Speaker: René Vidal (UPenn)

Date: 19-06-2024, 3pm-4pm (BST)

Location: Computer Science Building, CS1.04, University of Warwick, Coventry, UK

Abstract

This talk will provide a detailed analysis of the dynamics of gradient-based methods in overparametrized models. For linear networks, we show that the weights converge to an equilibrium at a linear rate that depends on the imbalance between input and output weights (which is fixed at initialization) and the margin of the initial solution. For ReLU networks, we show that the dynamics have a feature learning phase, in which neurons collapse to one of the class centers, and a classifier learning phase, in which the loss converges to zero at a rate of 1/t.

About René Vidal

René Vidal, a global pioneer of data science, is the Rachleff University Professor, with joint appointments in the Department of Radiology in the Perelman School of Medicine and the Department of Electrical and Systems Engineering in the School of Engineering and Applied Science. Dr. Vidal has been named a Penn Integrates Knowledge University Professor at the University of Pennsylvania.

René Vidal received his B.S. degree in Electrical Engineering (highest honors) from the Pontificia Universidad Católica de Chile in 1997 and his M.S. and Ph.D. degrees in Electrical Engineering and Computer Sciences from the University of California at Berkeley in 2000 and 2003, respectively. He was a research fellow at National ICT Australia in 2003 and joined The Johns Hopkins University in 2004 as a faculty member in the Department of Biomedical Engineering and the Center for Imaging Science.

Upcoming Events

Learning Dynamics of Overparametrized Neural Networks

TBD

Causal Effect Estimation with Context and Confounders

Organising Team

Fanghui Liu

Paris Giampouras

Long Tran-Thanh