
Applied Mathematics Seminar: A Universal Law in Deep Learning: from MLP to Transformer
Posted: 2024-07-15
Speaker(s): Prof. Weijie Su (UPenn)
Time: 2024-07-15 10:15-11:15
Venue: Room 225, Siyuan Hall, Zhihua Building (智华楼-四元厅-225)
Abstract:
In this talk, we introduce a universal phenomenon that governs the inner workings of a wide range of neural network architectures, including multilayer perceptrons, convolutional neural networks, transformers, and Mamba. Through extensive computational experiments, we demonstrate that deep neural networks tend to process data so that each layer contributes a roughly uniform improvement. We conclude the talk by discussing how this universal law offers useful insights for practice.