Speaker: Yaoyu Zhang (张耀宇), Associate Professor, Shanghai Jiao Tong University
Host: Xin Geng (耿新)
Time: Friday, October 31, 2025, 14:00–15:00
Venue: Lecture Hall 513, Computer Science Building, Jiulonghu Campus
Abstract: Condensation (also known as quantization, clustering, or alignment) is a widely observed phenomenon in which neurons in the same layer tend to align with one another during the nonlinear training of deep neural networks (DNNs). It is a key characteristic of the feature learning process of neural networks. In recent years, to advance the mathematical understanding of condensation, we have uncovered structures regarding the dynamical regime, loss landscape, and generalization of deep neural networks, from which a novel theoretical framework emerges. This presentation will cover these findings in detail. First, I will present results on identifying the dynamical regime of condensation in the infinite-width limit, where small initialization is crucial. Then, I will discuss the mechanism of condensation at the initial training stage and the global loss landscape structure underlying condensation in later training stages, highlighting the prevalence of condensed critical points and global minimizers. Finally, I will present results on the quantification of condensation and its generalization advantage, including a novel estimate of sample complexity in the best-possible scenario. These results underscore the effectiveness of the phenomenological approach to understanding DNNs, paving the way for further development of deep learning theory.
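As a rough illustration of what "condensation" means operationally, the sketch below measures how aligned the input-weight vectors of a layer's neurons are, using mean absolute pairwise cosine similarity. This is a minimal illustrative metric, not the quantification developed in the talk; the function name and the toy weight matrices are assumptions for demonstration only.

```python
import numpy as np

def pairwise_cosine(W):
    """Mean absolute pairwise cosine similarity between the rows of W.

    Each row is one neuron's input-weight vector; a value near 1 means
    the neurons have condensed onto a single direction (up to sign/scale).
    """
    Wn = W / np.linalg.norm(W, axis=1, keepdims=True)  # normalize rows
    S = Wn @ Wn.T                                      # cosine similarity matrix
    iu = np.triu_indices(W.shape[0], k=1)              # off-diagonal pairs
    return np.abs(S[iu]).mean()

rng = np.random.default_rng(0)

# A "condensed" layer: all 8 neurons share one direction, differing only in scale/sign.
direction = np.array([1.0, -2.0, 0.5])
condensed = np.outer(rng.normal(size=8), direction)

# A generic layer: 8 independent random directions in 3 dimensions.
random_layer = rng.normal(size=(8, 3))

print(pairwise_cosine(condensed))     # ≈ 1.0: fully aligned
print(pairwise_cosine(random_layer))  # noticeably smaller: no alignment
```

In practice one would apply such a metric to the hidden-layer weights of a small-initialization network over the course of training and watch the alignment grow.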
Speaker bio: Yaoyu Zhang is a tenure-track associate professor at the Institute of Natural Sciences / School of Mathematical Sciences, Shanghai Jiao Tong University. He received his B.S. in Physics from Zhiyuan College, Shanghai Jiao Tong University, in 2012, and his Ph.D. in Mathematics from Shanghai Jiao Tong University in 2016. From 2016 to 2020, he conducted postdoctoral research at New York University Abu Dhabi & the Courant Institute, and at the Institute for Advanced Study, Princeton. His research focuses on the theoretical foundations of deep learning.

