Applied Mathematics Seminar: A Universal Law in Deep Learning: from MLP to Transformer

Posted: 2024-07-15

Speaker(s): Prof. Weijie Su (UPenn)

Time: 2024-07-15 10:15-11:15

Venue: Room 225, Siyuan Hall (四元厅), Zhihua Building (智华楼)

Abstract:

In this talk, we introduce a universal phenomenon that governs the inner workings of a wide range of neural network architectures, including multilayer perceptrons, convolutional neural networks, transformers, and Mamba. Through extensive computational experiments, we demonstrate that deep neural networks tend to process data with uniform improvement across layers. We conclude the talk by discussing how this universal law offers useful insights for practice.