Machine Learning and Data Science PhD Student Forum (Session 41) — To Understand Neural Networks with Scale Invariant Architectures

Posted: 2022-12-05

Speaker(s): Kun Chen (PKU)

Time: 2022-12-05 16:00-17:00

Venue: Tencent Meeting 723 1564 5542

Abstract:
While deep learning has achieved great success in practice, its theory remains largely mysterious. Empirically, networks trained with normalization layers (e.g., Batch Normalization, Layer Normalization) perform markedly better. From a theoretical point of view, these normalization layers make the network scale invariant with respect to its weights, a property that is helpful for understanding optimization, and in particular the role of the learning rate. Using scale invariance and the resulting notion of an effective learning rate, we can partly explain several phenomena observed in deep learning.
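As a minimal sketch of what scale invariance means here (a standard formulation, with f and w introduced only for illustration): let f(w; x) denote the network output on input x and let w be the weights feeding into a normalization layer. Then rescaling w by any c > 0 leaves the output unchanged, and differentiating this identity gives

\[
f(c\,w;\,x) = f(w;\,x) \quad \forall\, c > 0,
\qquad
\nabla_w f(c\,w;\,x) = \frac{1}{c}\,\nabla_w f(w;\,x),
\qquad
\langle \nabla_w f(w;\,x),\, w \rangle = 0 .
\]

Consequently, a gradient step with step size \(\eta\) rotates the normalized direction \(w/\|w\|\) as if the step size were

\[
\eta_{\mathrm{eff}} \;=\; \frac{\eta}{\|w\|^{2}},
\]

which is what the "effective learning rate" refers to below.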

In this talk, we will briefly introduce scale-invariant architectures and the effective learning rate they induce, and then explain how this viewpoint accounts for several phenomena in deep learning. Some works built directly on the effective learning rate will also be discussed.