Awesome papers in machine learning theory & deep learning theory.
Understanding Machine Learning: From Theory to Algorithms.
- Year 2014.
- Shai Shalev-Shwartz, Shai Ben-David.
- book
Foundations of Machine Learning.
- Year 2018.
- Mehryar Mohri, Afshin Rostamizadeh, and Ameet Talwalkar.
- book
Learning Theory from First Principles.
- Year 2021.
- Francis Bach.
- book
Learnability and the Vapnik-Chervonenkis Dimension.
- Journal of the Association for Computing Machinery 1989.
- Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, Manfred M. Warmuth.
- paper
Sample Compression, Learnability, and the Vapnik-Chervonenkis Dimension.
- Machine Learning 1995.
- Sally Floyd, Manfred Warmuth.
- paper
Characterizations of Learnability for Classes of {0, ..., n}-valued Functions.
- Journal of Computer and System Sciences 1995.
- Shai Ben-David, Nicolo Cesa-Bianchi, David Haussler, Philip M. Long.
- paper
Scale-Sensitive Dimensions, Uniform Convergence, and Learnability.
- JACM 1997.
- Noga Alon, Shai Ben-David, Nicolo Cesa-Bianchi, David Haussler.
- paper
Regret Bounds for Prediction Problems.
- COLT 1999.
- Geoffrey J. Gordon.
- paper
A Study About Algorithmic Stability and Their Relation to Generalization Performances.
- Technical Report 2000.
- Andre Elisseeff.
- paper
Algorithmic Stability and Generalization Performance.
- NIPS 2001.
- Olivier Bousquet, Andre Elisseeff.
- paper
A Generalized Representer Theorem.
- COLT 2001.
- Bernhard Scholkopf, Ralf Herbrich, Alex J. Smola.
- paper
Concentration Inequalities and Empirical Processes Theory Applied to the Analysis of Learning Algorithms.
- PhD Thesis 2002.
- Olivier Bousquet.
- paper
Rademacher and Gaussian Complexities: Risk Bounds and Structural Results.
- JMLR 2002.
- Peter L. Bartlett, Shahar Mendelson.
- paper
Stability and Generalization.
- JMLR 2002.
- Olivier Bousquet, Andre Elisseeff.
- paper
Almost-Everywhere Algorithmic Stability and Generalization Error.
- UAI 2002.
- Samuel Kutin, Partha Niyogi.
- paper
PAC-Bayes & Margins.
- NIPS 2003.
- John Langford, John Shawe-Taylor.
- paper
Statistical Behavior and Consistency of Classification Methods based on Convex Risk Minimization.
- Annals of Statistics 2004.
- Tong Zhang.
- paper
Theory of Classification: A Survey of Some Recent Advances.
- ESAIM: Probability and Statistics 2005.
- Stephane Boucheron, Olivier Bousquet, Gabor Lugosi.
- paper
Learning Theory: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization.
- Advances in Computational Mathematics 2006.
- Sayan Mukherjee, Partha Niyogi, Tomaso Poggio, Ryan Rifkin.
- paper
Tutorial on Practical Prediction Theory for Classification.
- JMLR 2006.
- John Langford.
- paper
Rademacher Complexity Bounds for Non-I.I.D. Processes.
- NIPS 2008.
- Mehryar Mohri, Afshin Rostamizadeh.
- paper
On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization.
- NIPS 2008.
- Sham M. Kakade, Karthik Sridharan, and Ambuj Tewari.
- paper
Agnostic Online Learning.
- COLT 2009.
- Shai Ben-David, David Pal, Shai Shalev-Shwartz.
- paper
Learnability, Stability and Uniform Convergence.
- JMLR 2010.
- Shai Shalev-Shwartz, Ohad Shamir, Nathan Srebro, Karthik Sridharan.
- paper
Multiclass Learnability and the ERM Principle.
- COLT 2011.
- Amit Daniely, Sivan Sabato, Shai Ben-David, Shai Shalev-Shwartz.
- paper
Algorithmic Stability and Hypothesis Complexity.
- ICML 2017.
- Tongliang Liu, Gábor Lugosi, Gergely Neu, Dacheng Tao.
- paper
Stability and Generalization of Learning Algorithms that Converge to Global Optima.
- ICML 2018.
- Zachary Charles, Dimitris Papailiopoulos.
- paper
Generalization Bounds for Uniformly Stable Algorithms.
- NIPS 2018.
- Vitaly Feldman, Jan Vondrak.
- paper
Reconciling Modern Machine Learning Practice and the Bias-Variance Trade-Off.
- arXiv 2019.
- Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal.
- paper
Sharper Bounds for Uniformly Stable Algorithms.
- COLT 2020.
- Olivier Bousquet, Yegor Klochkov, Nikita Zhivotovskiy.
- paper
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures.
- NeurIPS 2021.
- Yuan Cao, Quanquan Gu, Mikhail Belkin.
- paper
The Modern Mathematics of Deep Learning.
- arXiv 2021.
- Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen.
- book
Deep Learning Theory Review: An Optimal Control and Dynamical Systems Perspective.
- arXiv 2019.
- Guan-Horng Liu, Evangelos A. Theodorou.
- paper
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation.
- arXiv 2019.
- Greg Yang.
- paper
Regularization Algorithms for Learning that are Equivalent to Multilayer Networks.
- Science 1990.
- T. Poggio, F. Girosi.
- paper
Strong Universal Consistency of Neural Network Classifiers.
- IEEE Transactions on Information Theory 1993.
- András Faragó, Gábor Lugosi.
- paper
For Valid Generalization, the Size of the Weights is More Important Than the Size of the Network.
- NIPS 1996.
- Peter L. Bartlett.
- paper
Benefits of Depth in Neural Networks.
- COLT 2016.
- Matus Telgarsky.
- paper
Understanding Deep Learning Requires Rethinking Generalization.
- ICLR 2017.
- Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals.
- paper
Convergence Analysis of Two-layer Neural Networks with ReLU Activation.
- arXiv 2017.
- Yuanzhi Li, Yang Yuan.
- paper
Neural Tangent Kernel: Convergence and Generalization in Neural Networks.
- NeurIPS 2018.
- Arthur Jacot, Franck Gabriel, Clément Hongler.
- paper
PAC-Bayesian Margin Bounds for Convolutional Neural Networks.
To Understand Deep Learning We Need to Understand Kernel Learning.
- ICML 2018.
- Mikhail Belkin, Siyuan Ma, Soumik Mandal.
- paper
The Vapnik–Chervonenkis Dimension of Graph and Recursive Neural Networks.
- Neural Networks 2018.
- Franco Scarselli, Ah Chung Tsoi, Markus Hagenbuchner.
- paper
Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks.
- arXiv 2019.
- Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma.
- paper
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers.
- NeurIPS 2019.
- Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang.
- paper
Deep Learning Generalizes Because the Parameter-Function Map is Biased Towards Simple Functions.
- ICLR 2019.
- Guillermo Valle Pérez, Chico Q. Camargo, Ard A. Louis.
- paper
Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent.
- arXiv 2020.
- David Holzmuller, Ingo Steinwart.
- paper
On the Distance Between Two Neural Networks and the Stability of Learning.
- NeurIPS 2021.
- Jeremy Bernstein, Arash Vahdat, Yisong Yue, Ming-Yu Liu.
- paper
Generalization Performance of Empirical Risk Minimization on Over-parameterized Deep ReLU Nets.
- arXiv 2021.
- Shao-Bo Lin, Yao Wang, Ding-Xuan Zhou.
- paper
Learnability of Convolutional Neural Networks for Infinite Dimensional Input via Mixed and Anisotropic Smoothness.
- ICLR 2022.
- Sho Okumoto, Taiji Suzuki.
- paper