A recent line of research on deep learning shows that the training of extremely wide neural networks can be characterized by a kernel function called the neural tangent kernel (NTK). However, it is known that this type of result does not perfectly match practice: NTK-based analysis requires the network weights to stay very close to their initialization throughout training, and cannot handle regularizers or gradient noise. In this talk, I will present a generalized neural tangent kernel analysis and show that noisy gradient descent with weight decay can still exhibit "kernel-like" behavior. This implies that the training loss converges linearly up to a certain accuracy. I will also discuss the generalization error of an infinitely wide two-layer neural network trained by noisy gradient descent with weight decay.
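As a rough illustration (not taken from the talk itself), noisy gradient descent with weight decay is commonly written as the following update for network weights \(\theta_t\), with step size \(\eta\), weight-decay coefficient \(\lambda\), and an inverse-temperature parameter \(\beta\) controlling the injected Gaussian noise; the precise form used in the speaker's analysis may differ:
\[
\theta_{t+1} = \theta_t - \eta\big(\nabla L(\theta_t) + \lambda\,\theta_t\big) + \sqrt{2\eta/\beta}\,\xi_t, \qquad \xi_t \sim \mathcal{N}(0, I).
\]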
14 Aug 2020
11am - 12pm
Where
https://hkust.zoom.us/j/5616960008
Speakers/Performers
Dr. Yuan CAO
UCLA
Organizer(s)
Department of Mathematics
Contact/Enquiries
mathseminar@ust.hk
Audience
Alumni, Faculty and Staff, PG Students, UG Students
Language(s)
English