- 7/26: Lectures 7 and 9 are out.
- 7/25: The paper review report is due on 8/2, 12 pm.
- 7/19: The schedule of presentations is out.
- 7/18: A draft of Lecture 4 is out.
- 7/17: Drafts of Lectures 3, 5, and 6 are out.
- 7/12: A draft of Lecture 2 is out.
- 7/12: Some references for random feature models, Barron spaces, and the regularization theory of two-layer nets have been added.
- 7/9: A draft of Lecture 1 is out.
- 7/9: Homework 2 is out. It is due on Tuesday, 7/16, 12 pm.
- 7/6: Homework 1 is out. It is due on Friday, 7/12, 12 pm.
Instructors:
- Weinan E
- Lei Wu, leiwu@princeton.edu
- Chao Ma, chaom@princeton.edu
Time: Tue: 2:00-5:00 pm; Thu: 2:00-5:00 pm; Fri: 3:00-5:00 pm.
Location: Room 515, Teaching Building 2
Description:
This course introduces the basic models for supervised learning, including the kernel method, two-layer neural networks, and residual networks. We then provide a unified approach to analyzing these models (the three model classes are sketched below).
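For orientation, the three model classes can be written schematically as follows. This is a sketch in standard notation; the scaling conventions used in the lectures may differ:

```latex
% Kernel method: expansion over a positive definite kernel k
f(x) = \sum_{i=1}^{n} a_i \, k(x, x_i)

% Two-layer neural network with m hidden neurons and activation \sigma
f(x) = \frac{1}{m} \sum_{j=1}^{m} a_j \, \sigma(w_j^\top x)

% Residual network: a composition of residual blocks
z_0 = V x, \qquad z_{l+1} = z_l + U_l \, \sigma(W_l z_l), \qquad f(x) = \alpha^\top z_L
```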
Topics:
- Supervised learning, generalization/approximation/estimation error, a priori/a posteriori estimates (the standard error decomposition is sketched after this list)
- Kernel method, two-layer neural network, residual network
- Reproducing kernel Hilbert space, Barron space, compositional function space
- Rademacher complexity, margin, gradient descent, implicit regularization
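As a point of reference for the first topic above, the excess risk of an empirical risk minimizer splits into two parts. This is a generic sketch with standard notation, not the course's exact formulation:

```latex
% Hypothesis class \mathcal{H}, target f^*, empirical risk minimizer \hat{f} over \mathcal{H}
R(\hat{f}) - R(f^*)
  = \underbrace{\inf_{f \in \mathcal{H}} R(f) - R(f^*)}_{\text{approximation error}}
  + \underbrace{R(\hat{f}) - \inf_{f \in \mathcal{H}} R(f)}_{\text{estimation error}}
```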
Prerequisites:
- A solid background in linear algebra, real analysis, and probability/measure theory
- Basic knowledge of (convex) optimization and statistics
Coursework:
- Homework (45%)
- Paper review (45%): You are asked to choose a paper from this paper list and write a review. The review should not only summarize the paper but also identify the novelty and limitations of the result. A good paper review at least attempts to answer the following four questions:
- What is the main result of the paper?
- Why is the result important and significant compared with other papers?
- What is the limitation of the result?
- What is the potential research direction inspired by the paper?
You are required to give a presentation (15%) and submit a 3-page report (30%).
- Scribe notes (10%): You are asked to scribe notes in LaTeX. The scribe notes can be done in pairs. Please use this template:
Collaboration policy: We encourage you to form study groups and discuss the coursework. However, you must write up all coursework from scratch independently, without referring to anyone else's notes.
References:
- Peter Bartlett's course: Statistical Learning Theory
- MIT's course: Statistical Learning Theory
- Mohri's book: Foundations of Machine Learning
- Shai Shalev-Shwartz's book: Understanding Machine Learning: From Theory to Algorithms
Schedule:
- Tue 7/2: Introduction to supervised learning methods
- Thu 7/4: Overview of mathematical theory for neural network models
- Fri 7/5: Rademacher complexity, covering number, metric entropy, and uniform bounds (the basic definition and bound are sketched below)
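As a quick reference for this lecture: for a class of functions taking values in [0, 1], the empirical Rademacher complexity and the uniform bound it yields take the following standard form (a sketch; exact constants vary across references):

```latex
% Empirical Rademacher complexity of \mathcal{F} on a sample S = (x_1, \dots, x_n);
% \sigma_1, \dots, \sigma_n are i.i.d. uniform random signs
\hat{\mathcal{R}}_S(\mathcal{F})
  = \mathbb{E}_{\sigma} \Big[ \sup_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} \sigma_i f(x_i) \Big]

% Uniform bound for f \in \mathcal{F} with values in [0, 1]:
% with probability at least 1 - \delta over the sample,
\sup_{f \in \mathcal{F}} \Big( \mathbb{E}[f] - \frac{1}{n} \sum_{i=1}^{n} f(x_i) \Big)
  \le 2 \hat{\mathcal{R}}_S(\mathcal{F}) + 3 \sqrt{\frac{\log(2/\delta)}{2n}}
```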
- Reproducing kernel Hilbert space and the random feature model (the model is sketched after this block)
- Error estimates for the random feature model with explicit and implicit regularization
- Lecture 5
- The analysis of implicit regularization for the random feature model can be found in this paper
- Learning with SGD and Random Features
- Optimal Rates for the Regularized Least-Squares Algorithm
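For reference, the random feature model discussed in these lectures can be written as follows; a generic sketch with standard notation:

```latex
% Random feature model: features \phi(\cdot\,; w) with w_1, \dots, w_m drawn i.i.d.
% from a fixed distribution \pi; only the coefficients a_1, \dots, a_m are trained
f_m(x; a) = \frac{1}{m} \sum_{j=1}^{m} a_j \, \phi(x; w_j),
  \qquad w_j \overset{\text{i.i.d.}}{\sim} \pi

% As m \to \infty, the model approximates the kernel method with kernel
k(x, x') = \mathbb{E}_{w \sim \pi} \big[ \phi(x; w) \, \phi(x'; w) \big]
```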
- Barron space and regularization theory of two-layer neural networks
- Lecture 6
- Properties of Barron space can be found in Section 2 of this paper
- The a priori estimates of regularized two-layer neural networks can be found in this paper
- The must-read classic paper by Andrew Barron (This is the first paper that provides an approximation rate without the curse of dimensionality; the result is sketched below.)
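For context, Barron's classical result can be stated schematically as follows (a sketch with constants omitted):

```latex
% If f has finite spectral (Barron) norm
%   C_f = \int_{\mathbb{R}^d} \|\omega\| \, |\hat{f}(\omega)| \, d\omega < \infty,
% then there exists a two-layer network f_m with m neurons such that
\| f - f_m \|_{L^2(\mu)}^2 \lesssim \frac{C_f^2}{m}
% The rate O(1/m) does not depend on the input dimension d: this is the sense
% in which the result avoids the curse of dimensionality.
```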
- Implicit regularization for two-layer neural networks
- A priori estimates for regularized deep residual networks
- F-principle and its application in deep learning (Guest speakers: Zhiqin Xu, Yaoyu Zhang, Tao Luo; the principle is sketched after this block)
- An introduction to F-principle Lecture 9.1
- Application of F-principle in learning two-layer neural networks Lecture 9.2
- General theory of F-principle Lecture 9.3
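For these guest lectures, the F-principle (frequency principle) refers to the empirical observation that neural networks trained by gradient descent tend to fit the low-frequency components of the target function before the high-frequency ones. Schematically, in Fourier terms (a sketch, not the speakers' exact formulation):

```latex
% Relative error of the network f_{\theta(t)} against the target f^* at frequency \xi:
e(\xi, t) = \frac{\big| \widehat{f_{\theta(t)}}(\xi) - \widehat{f^*}(\xi) \big|}
                 {\big| \widehat{f^*}(\xi) \big|}
% F-principle: e(\xi, t) decays earlier in training time t at low \|\xi\|
% than at high \|\xi\|.
```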