
# stat453-deep-learning-ss21

STAT 453: Intro to Deep Learning @ UW-Madison (Spring 2021)

## Table of Contents

## Part 1: Introduction

### L01: Introduction to deep learning

| Videos | Material |
| ------ | -------- |
| L1.0 Intro to Deep Learning, Course Introduction | L01-intro_slides.pdf |
| L1.1.1 Course Overview Part 1: Motivation and Topics | |
| L1.1.2 Course Overview Part 2: Organization (Optional) | |
| L1.2 What is Machine Learning? | |
| L1.3.1 Broad Categories of ML Part 1: Supervised Learning | |
| L1.3.2 Broad Categories of ML Part 2: Unsupervised Learning | |
| L1.3.3 Broad Categories of ML Part 3: Reinforcement Learning | |
| L1.3.4 Broad Categories of ML Part 4: Special Cases of Supervised Learning | |
| L1.4 The Supervised Learning Workflow | |
| L1.5 Necessary Machine Learning Notation and Jargon | |
| L1.6 About the Practical Aspects and Tools Used in This Course | code |

### L02: The brief history of deep learning

| Videos | Material |
| ------ | -------- |
| L2.0 A Brief History of Deep Learning -- Lecture Overview | L02_dl-history_slides.pdf |
| L2.1 Artificial Neurons | |
| L2.2 Multilayer Networks | |
| L2.3 The Origins of Deep Learning | |
| L2.4 The Deep Learning Hardware & Software Landscape | |
| L2.5 Current Trends in Deep Learning | |

### L03: Single-layer neural networks: The perceptron algorithm

| Videos | Material |
| ------ | -------- |
| L3.0 Perceptron Lecture Overview | L03_perceptron_slides.pdf |
| L3.1 About Brains and Neurons | |
| L3.2 The Perceptron Learning Rule | code |
| L3.3 Vectorization in Python | code |
| L3.4 Perceptron in Python using NumPy and PyTorch | code |
| L3.5 The Geometric Intuition Behind the Perceptron | |
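
As a companion to L3.2 and L3.4, here is a minimal NumPy sketch of the perceptron learning rule (not the course's own notebook; function and variable names are illustrative):

```python
import numpy as np

def train_perceptron(X, y, epochs=10):
    """Classic perceptron rule: update weights only on misclassified points.
    X: (n_samples, n_features), y: labels in {0, 1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0   # threshold activation
            error = target - pred               # in {-1, 0, 1}
            w += error * xi                     # nudge the boundary toward the error
            b += error
    return w, b
```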

## Part 2: Mathematical and computational foundations

### L04: Linear algebra and calculus for deep learning

| Videos | Material |
| ------ | -------- |
| L4.0 Linear Algebra for Deep Learning -- Lecture Overview | L04_linalg-dl_slides.pdf |
| L4.1 Tensors in Deep Learning | |
| L4.2 Tensors in PyTorch | |
| L4.3 Vectors, Matrices, and Broadcasting | |
| L4.4 Notational Conventions for Neural Networks | |
| L4.5 A Fully Connected (Linear) Layer in PyTorch | |
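
A short sketch of the L4.3/L4.5 ideas, assuming standard PyTorch: broadcasting a vector across a matrix, and checking that `nn.Linear` computes exactly `X @ W.T + b`:

```python
import torch

# Broadcasting: a (4, 3) matrix plus a (3,) vector adds the vector to every row.
X = torch.randn(4, 3)
bias = torch.tensor([1.0, 2.0, 3.0])
print((X + bias).shape)  # torch.Size([4, 3])

# A fully connected (linear) layer:
layer = torch.nn.Linear(in_features=3, out_features=2)
out = layer(X)
print(out.shape)  # torch.Size([4, 2])

# Same result computed manually from the layer's parameters:
manual = X @ layer.weight.T + layer.bias
print(torch.allclose(out, manual))  # True
```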

### L05: Parameter optimization with gradient descent

| Videos | Material |
| ------ | -------- |
| L5.0 Gradient Descent -- Lecture Overview | L05_gradient-descent_slides.pdf |
| L5.1 Online, Batch, and Minibatch Mode | |
| L5.2 Relation Between Perceptron and Linear Regression | |
| L5.3 An Iterative Training Algorithm for Linear Regression | |
| L5.4 (Optional) Calculus Refresher I: Derivatives | |
| L5.5 (Optional) Calculus Refresher II: Gradients | |
| L5.6 Understanding Gradient Descent | |
| L5.7 Training an Adaptive Linear Neuron (Adaline) | |
| L5.8 Adaline Code Example | code |
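
A minimal full-batch Adaline trainer in NumPy, in the spirit of L5.7/L5.8 (an illustrative sketch; the course notebook differs in details):

```python
import numpy as np

def train_adaline(X, y, lr=0.01, epochs=50):
    """Adaline: linear net input, MSE loss, full-batch gradient descent.
    X: (n, m) standardized features, y: numeric targets."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        output = X @ w + b                      # linear activation, no threshold
        errors = y - output
        # Gradients of MSE = mean((y - output)^2) w.r.t. w and b:
        w += lr * 2.0 * X.T @ errors / X.shape[0]
        b += lr * 2.0 * errors.mean()
    return w, b
```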

### L06: Automatic differentiation with PyTorch

| Videos | Material |
| ------ | -------- |
| L6.0 Automatic Differentiation in PyTorch -- Lecture Overview | L06_pytorch_slides.pdf |
| L6.1 Learning More About PyTorch | |
| L6.2 Understanding Automatic Differentiation via Computation Graphs | |
| L6.3 Automatic Differentiation in PyTorch -- Code Example | code |
| L6.4 Training ADALINE with PyTorch -- Code Example | code |
| L6.5 A Closer Look at the PyTorch API | |
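
The core of L6.2/L6.3 in a few lines, assuming standard PyTorch autograd: build a scalar computation graph, call `backward()`, and read the gradients:

```python
import torch

x = torch.tensor(3.0)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

loss = (w * x + b - 10.0) ** 2   # forward pass records the graph
loss.backward()                  # reverse-mode automatic differentiation

# d loss / d w = 2 * (w*x + b - 10) * x = 2 * (7 - 10) * 3 = -18
print(w.grad)  # tensor(-18.)
print(b.grad)  # tensor(-6.)
```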

### L07: Cluster and cloud computing resources

| Videos | Material |
| ------ | -------- |
| L7.0 GPU resources & Google Colab | L07_cloud-computing_slides.pdf |

## Part 3: Introduction to neural networks

### L08: Multinomial logistic regression / Softmax regression

| Videos | Material |
| ------ | -------- |
| L8.0 Logistic Regression -- Lecture Overview | L08_logistic__slides.pdf |
| L8.1 Logistic Regression as a Single-Layer Neural Network | |
| L8.2 Logistic Regression Loss Function | |
| L8.3 Logistic Regression Loss Derivative and Training | |
| L8.4 Logits and Cross Entropy | |
| L8.5 Logistic Regression in PyTorch -- Code Example | code |
| L8.6 Multinomial Logistic Regression / Softmax Regression | |
| L8.7.1 OneHot Encoding and Multi-category Cross Entropy | code |
| L8.7.2 OneHot Encoding and Multi-category Cross Entropy -- Code Example | code |
| L8.8 Softmax Regression Derivatives for Gradient Descent | |
| L8.9 Softmax Regression -- Code Example Using PyTorch | code |
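
A sketch of softmax regression in the spirit of L8.9, assuming standard PyTorch (the toy data is made up): note that `F.cross_entropy` expects raw logits (L8.4) and applies log-softmax internally:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(1)
X = torch.randn(6, 4)                  # 6 samples, 4 features (toy data)
y = torch.tensor([0, 1, 2, 1, 0, 2])   # 3 classes

model = torch.nn.Linear(4, 3)          # softmax regression = one linear layer
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for _ in range(100):
    logits = model(X)                  # raw scores; no softmax here
    loss = F.cross_entropy(logits, y)  # log-softmax + NLL in one call
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

probas = torch.softmax(model(X), dim=1)  # softmax only for interpretation
print(probas.argmax(dim=1))              # predicted class labels
```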

### L09: Multilayer perceptrons and backpropagation

| Videos | Material |
| ------ | -------- |
| L9.0 Multilayer Perceptrons -- Lecture Overview | L09_mlp__slides.pdf |
| L9.1 Multilayer Perceptron Architecture | |
| L9.2 Nonlinear Activation Functions | code |
| L9.3.1 Multilayer Perceptron -- Code Example Part 1/3 (Slide Overview) | |
| L9.3.2 Multilayer Perceptron in PyTorch -- Code Example Part 2/3 (Jupyter Notebook) | code |
| L9.3.3 Multilayer Perceptron in PyTorch -- Code Example Part 3/3 (Script Setup) | code |
| L9.4 Overfitting and Underfitting | |
| L9.5.1 Cats & Dogs and Custom Data Loaders | |
| L9.5.2 Custom DataLoaders in PyTorch -- Code Example | code |
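
A minimal MLP in the style of L9.3.x (layer sizes are illustrative, not the course's exact architecture):

```python
import torch

class MLP(torch.nn.Module):
    """Two-layer multilayer perceptron for, e.g., 28x28 images (784 inputs)."""
    def __init__(self, num_features=784, num_hidden=100, num_classes=10):
        super().__init__()
        self.layers = torch.nn.Sequential(
            torch.nn.Linear(num_features, num_hidden),
            torch.nn.ReLU(),                      # nonlinear activation (L9.2)
            torch.nn.Linear(num_hidden, num_classes),
        )

    def forward(self, x):
        return self.layers(x.flatten(start_dim=1))  # logits

model = MLP()
logits = model(torch.randn(32, 1, 28, 28))  # a dummy minibatch
print(logits.shape)  # torch.Size([32, 10])
```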

### L10: Regularization to avoid overfitting

| Videos | Material |
| ------ | -------- |
| L10.0 Regularization Methods for Neural Networks -- Lecture Overview | L10_regularization__slides.pdf |
| L10.1 Techniques for Reducing Overfitting | |
| L10.2 Data Augmentation in PyTorch | code |
| L10.3 Early Stopping | |
| L10.4 L2 Regularization for Neural Nets | code |
| L10.5.1 The Main Concept Behind Dropout | |
| L10.5.2 Dropout Co-Adaptation Interpretation | |
| L10.5.3 (Optional) Dropout Ensemble Interpretation | |
| L10.5.4 Dropout in PyTorch | code |
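
The two regularizers from L10.4 and L10.5.4, sketched with standard PyTorch (architecture and hyperparameters are illustrative):

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(784, 100),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),   # training: zero 50% of activations, scale rest by 2
    torch.nn.Linear(100, 10),
)

# L2 regularization is usually added via the optimizer's weight_decay argument.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)

model.train()   # dropout active
model.eval()    # dropout off; no rescaling needed (PyTorch uses inverted dropout)
```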

### L11: Input normalization and weight initialization

| Videos | Material |
| ------ | -------- |
| L11.0 Input Normalization and Weight Initialization -- Lecture Overview | L11_norm-and-init__slides.pdf |
| L11.1 Input Normalization | |
| L11.2 How BatchNorm Works | |
| L11.3 BatchNorm in PyTorch -- Code Example | code |
| L11.4 Why BatchNorm Works | |
| L11.5 Weight Initialization -- Why Do We Care? | |
| L11.6 Xavier Glorot and Kaiming He Initialization | |
| L11.7 Weight Initialization in PyTorch -- Code Example | code |
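
BatchNorm (L11.3) and Kaiming He initialization (L11.7) in a standard-PyTorch sketch; the layer sizes and the choice of `kaiming_normal_` over the Xavier Glorot variants are illustrative:

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(784, 100),
    torch.nn.BatchNorm1d(100),   # normalizes each hidden unit over the minibatch
    torch.nn.ReLU(),
    torch.nn.Linear(100, 10),
)

# Kaiming (He) initialization suits ReLU; Xavier (Glorot) suits tanh/sigmoid.
for m in model.modules():
    if isinstance(m, torch.nn.Linear):
        torch.nn.init.kaiming_normal_(m.weight, nonlinearity='relu')
        torch.nn.init.zeros_(m.bias)
```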

### L12: Learning rates and advanced optimization algorithms

| Videos | Material |
| ------ | -------- |
| L12.0 Improving Gradient Descent-based Optimization -- Lecture Overview | L12_optim__slides.pdf |
| L12.1 Learning Rate Decay | |
| L12.2 Learning Rate Schedulers in PyTorch | code |
| L12.3 SGD with Momentum | |
| L12.4 Adam: Combining Adaptive Learning Rates and Momentum | code |
| L12.5 Choosing Different Optimizers in PyTorch | |
| L12.6 Additional Topics and Research on Optimization Algorithms | |
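
Adam plus a step-decay scheduler, as in L12.2/L12.4, assuming standard PyTorch (the schedule parameters are made up for illustration):

```python
import torch

model = torch.nn.Linear(10, 2)

# Adam combines adaptive per-parameter learning rates with momentum.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Halve the learning rate every 10 epochs (one common decay schedule).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(30):
    # ... run the minibatch training loop for this epoch here ...
    scheduler.step()   # call once per epoch, after the inner loop
    print(epoch, scheduler.get_last_lr())
```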

## Part 4: Deep learning for computer vision and language modeling

### L13: Introduction to convolutional neural networks

| Videos | Material |
| ------ | -------- |
| L13.0 Introduction to Convolutional Networks -- Lecture Overview | L13_intro-cnn__slides.pdf |
| L13.1 Common Applications of CNNs | |
| L13.2 Challenges of Image Classification | |
| L13.3 Convolutional Neural Network Basics | |
| L13.4 Convolutional Filters and Weight-Sharing | |
| L13.5 Cross-correlation vs. Convolution (Old) | code |
| L13.5 What's The Difference Between Cross-Correlation And Convolution? | code |
| L13.6 CNNs & Backpropagation | |
| L13.7 CNN Architectures & AlexNet | |
| L13.8 What a CNN Can See | |
| L13.9.1 LeNet-5 in PyTorch | code |
| L13.9.2 Saving and Loading Models in PyTorch | code |
| L13.9.3 AlexNet in PyTorch | code |
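
A LeNet-5-style network along the lines of L13.9.1 (a simplified sketch: max pooling and tanh stand in for the original's subsampling details):

```python
import torch

class LeNet5(torch.nn.Module):
    """LeNet-5-style CNN for 32x32 grayscale inputs."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = torch.nn.Sequential(
            torch.nn.Conv2d(1, 6, kernel_size=5),   # 32x32 -> 28x28
            torch.nn.Tanh(),
            torch.nn.MaxPool2d(2),                  # 28x28 -> 14x14
            torch.nn.Conv2d(6, 16, kernel_size=5),  # 14x14 -> 10x10
            torch.nn.Tanh(),
            torch.nn.MaxPool2d(2),                  # 10x10 -> 5x5
        )
        self.classifier = torch.nn.Sequential(
            torch.nn.Linear(16 * 5 * 5, 120),
            torch.nn.Tanh(),
            torch.nn.Linear(120, 84),
            torch.nn.Tanh(),
            torch.nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

print(LeNet5()(torch.randn(8, 1, 32, 32)).shape)  # torch.Size([8, 10])
```

Checkpointing as in L13.9.2 is typically `torch.save(model.state_dict(), path)` followed later by `model.load_state_dict(torch.load(path))`.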

### L14: Convolutional neural networks architectures

| Videos | Material |
| ------ | -------- |
| L14.0 Convolutional Neural Networks Architectures -- Lecture Overview | L14_cnn-architectures_slides.pdf |
| L14.1 Convolutions and Padding | |
| L14.2 Spatial Dropout and BatchNorm | |
| L14.3 Architecture Overview | |
| L14.3.1.1 VGG16 Overview | |
| L14.3.1.2 VGG16 in PyTorch -- Code Example | code |
| L14.3.2.1 ResNet Overview | |
| L14.3.2.2 ResNet-34 in PyTorch -- Code Example | code |
| L14.4.1 Replacing Max-Pooling with Convolutional Layers | |
| L14.4.2 All-Convolutional Network in PyTorch -- Code Example | code |
| L14.5 Convolutional Instead of Fully Connected Layers | |
| L14.6.1 Transfer Learning | |
| L14.6.2 Transfer Learning in PyTorch -- Code Example | code |
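
Transfer learning as in L14.6.2, sketched with torchvision's 2021-era API (`pretrained=True`); the two-class head and the learning rate are illustrative:

```python
import torch
from torchvision import models

# Load ImageNet-pretrained weights, freeze the feature extractor,
# and replace the final layer for a new task with, say, 2 classes.
model = models.resnet34(pretrained=True)
for param in model.parameters():
    param.requires_grad = False               # freeze everything

model.fc = torch.nn.Linear(model.fc.in_features, 2)  # new trainable head

# Only the new layer's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=0.001)
```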

### L15: Introduction to recurrent neural networks

| Videos | Material |
| ------ | -------- |
| L15.0 Introduction to Recurrent Neural Networks -- Lecture Overview | L15_intro-rnn__slides.pdf |
| L15.1 Different Methods for Working With Text Data | |
| L15.2 Sequence Modeling with RNNs | |
| L15.3 Different Types of Sequence Modeling Tasks | |
| L15.4 Backpropagation Through Time Overview | |
| L15.5 Long Short-Term Memory | |
| L15.6 RNNs for Classification: A Many-to-One Word RNN | resource |
| L15.7 An RNN Sentiment Classifier in PyTorch | code |
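
A many-to-one LSTM classifier in the spirit of L15.6/L15.7 (vocabulary and layer sizes are illustrative):

```python
import torch

class RNNClassifier(torch.nn.Module):
    """Many-to-one LSTM: embed tokens, run the sequence, classify from the
    final hidden state."""
    def __init__(self, vocab_size=20_000, embed_dim=128,
                 hidden_dim=256, num_classes=2):
        super().__init__()
        self.embedding = torch.nn.Embedding(vocab_size, embed_dim)
        self.rnn = torch.nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = torch.nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):             # (batch, seq_len) integer ids
        embedded = self.embedding(token_ids)  # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.rnn(embedded)   # hidden: (1, batch, hidden_dim)
        return self.fc(hidden[-1])            # logits from the last time step

model = RNNClassifier()
print(model(torch.randint(0, 20_000, (4, 50))).shape)  # torch.Size([4, 2])
```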

## Part 5: Deep generative models

### L16: Autoencoders

| Videos | Material |
| ------ | -------- |
| L16.0 Introduction to Autoencoders -- Lecture Overview | L16_autoencoder__slides.pdf |
| L16.1 Dimensionality Reduction | |
| L16.2 A Fully-Connected Autoencoder | |
| L16.3 Convolutional Autoencoders & Transposed Convolutions | |
| L16.4 A Convolutional Autoencoder in PyTorch -- Code Example | code |
| L16.5 Other Types of Autoencoders | |
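
A convolutional autoencoder with transposed-convolution upsampling, as in L16.3/L16.4 (channel counts and the 28x28 input size are illustrative):

```python
import torch

class ConvAutoencoder(torch.nn.Module):
    """Conv autoencoder for 28x28 inputs; the decoder mirrors the encoder
    with transposed convolutions."""
    def __init__(self):
        super().__init__()
        self.encoder = torch.nn.Sequential(
            torch.nn.Conv2d(1, 16, 3, stride=2, padding=1),   # 28 -> 14
            torch.nn.ReLU(),
            torch.nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 14 -> 7
            torch.nn.ReLU(),
        )
        self.decoder = torch.nn.Sequential(
            torch.nn.ConvTranspose2d(32, 16, 3, stride=2,
                                     padding=1, output_padding=1),  # 7 -> 14
            torch.nn.ReLU(),
            torch.nn.ConvTranspose2d(16, 1, 3, stride=2,
                                     padding=1, output_padding=1),  # 14 -> 28
            torch.nn.Sigmoid(),   # pixel values in [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

x = torch.randn(8, 1, 28, 28)
print(ConvAutoencoder()(x).shape)  # torch.Size([8, 1, 28, 28])
```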

### L17: Variational autoencoders

| Videos | Material |
| ------ | -------- |
| L17.0 Intro to Variational Autoencoders -- Lecture Overview | L17_vae__slides.pdf |
| L17.1 Variational Autoencoder Overview | |
| L17.2 Sampling from a Variational Autoencoder | |
| L17.3 The Log-Var Trick | |
| L17.4 Variational Autoencoder Loss Function | |
| L17.5 A Variational Autoencoder for Handwritten Digits in PyTorch -- Code Example | code |
| L17.6 A Variational Autoencoder for Face Images in PyTorch -- Code Example | code |
| L17.7 VAE Latent Space Arithmetic in PyTorch -- Making People Smile (Code Example) | code |
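
The reparameterized sampling step with the log-var trick (L17.2/L17.3) and the KL term of the VAE loss (L17.4), as a standalone sketch:

```python
import torch

def reparameterize(mu, log_var):
    """Log-var trick: the encoder predicts log(sigma^2), which keeps the
    variance positive while the optimization stays unconstrained.
    z = mu + sigma * eps keeps the graph differentiable w.r.t. mu and sigma."""
    std = torch.exp(0.5 * log_var)   # sigma = exp(log(sigma^2) / 2)
    eps = torch.randn_like(std)      # eps ~ N(0, I)
    return mu + std * eps

def kl_divergence(mu, log_var):
    """KL(q(z|x) || N(0, I)), summed over latent dims, averaged over the batch."""
    return (-0.5 * (1 + log_var - mu.pow(2) - log_var.exp()).sum(dim=1)).mean()

mu = torch.zeros(4, 20)
log_var = torch.zeros(4, 20)
z = reparameterize(mu, log_var)
print(z.shape, kl_divergence(mu, log_var))  # torch.Size([4, 20]) tensor(0.)
```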

### L18: Introduction to generative adversarial networks

| Videos | Material |
| ------ | -------- |
| L18.0 Introduction to Generative Adversarial Networks -- Lecture Overview | L18_gan__slides.pdf |
| L18.1 The Main Idea Behind GANs | |
| L18.2 The GAN Objective | |
| L18.3 Modifying the GAN Loss Function for Practical Use | |
| L18.4 A GAN for Generating Handwritten Digits in PyTorch -- Code Example | code |
| L18.5 Tips and Tricks to Make GANs Work | https://github.com/soumith/ganhacks |
| L18.6 A DCGAN for Generating Face Images in PyTorch -- Code Example | code |
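
One training iteration of a vanilla GAN with the non-saturating generator loss (L18.2/L18.3); the fully connected networks and all sizes here are illustrative stand-ins, not the course's DCGAN:

```python
import torch
import torch.nn.functional as F

latent_dim = 64
generator = torch.nn.Sequential(
    torch.nn.Linear(latent_dim, 256), torch.nn.ReLU(),
    torch.nn.Linear(256, 784), torch.nn.Tanh())
discriminator = torch.nn.Sequential(
    torch.nn.Linear(784, 256), torch.nn.LeakyReLU(0.2),
    torch.nn.Linear(256, 1))   # outputs a logit

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real = torch.randn(32, 784)    # stand-in for a minibatch of real images
ones, zeros = torch.ones(32, 1), torch.zeros(32, 1)

# Discriminator step: push real -> 1, fake -> 0 (detach so G gets no gradient).
fake = generator(torch.randn(32, latent_dim))
d_loss = (F.binary_cross_entropy_with_logits(discriminator(real), ones)
          + F.binary_cross_entropy_with_logits(discriminator(fake.detach()), zeros))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator step (non-saturating loss): make fakes be classified as real.
fake = generator(torch.randn(32, latent_dim))
g_loss = F.binary_cross_entropy_with_logits(discriminator(fake), ones)
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```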

### L19: Self-attention and transformer networks

| Videos | Material |
| ------ | -------- |
| L19.0 RNNs & Transformers for Sequence-to-Sequence Modeling -- Lecture Overview | L19_seq2seq_rnn-transformers__slides.pdf |
| L19.1 Sequence Generation with Word and Character RNNs | |
| L19.2.1 Implementing a Character RNN in PyTorch (Concepts) | |
| L19.2.2 Implementing a Character RNN in PyTorch -- Code Example | code |
| L19.3 RNNs with an Attention Mechanism | |
| L19.4.1 Using Attention Without the RNN -- A Basic Form of Self-Attention | |
| L19.4.2 Self-Attention and Scaled Dot-Product Attention | |
| L19.4.3 Multi-Head Attention | |
| L19.5.1 The Transformer Architecture | |
| L19.5.2.1 Some Popular Transformer Models: BERT, GPT, and BART -- Overview | |
| L19.5.2.2 GPT-v1: Generative Pre-Trained Transformer | |
| L19.5.2.3 BERT: Bidirectional Encoder Representations from Transformers | |
| L19.5.2.4 GPT-v2: Language Models are Unsupervised Multitask Learners | |
| L19.5.2.5 GPT-v3: Language Models are Few-Shot Learners | |
| L19.5.2.6 BART: Combining Bidirectional and Auto-Regressive Transformers | |
| L19.5.2.7 Closing Words -- The Recent Growth of Language Transformers | |
| L19.6 DistilBert Movie Review Classifier in PyTorch -- Code Example | code |
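
Scaled dot-product self-attention as covered in L19.4.2, in a minimal sketch (the separate `W_q`/`W_k`/`W_v` projections are illustrative; multi-head attention in L19.4.3 splits these across heads):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # (..., seq_q, seq_k)
    weights = F.softmax(scores, dim=-1)            # attention rows sum to 1
    return weights @ V

# Self-attention: Q, K, V are projections of the same sequence.
seq = torch.randn(2, 10, 64)   # (batch, seq_len, d_model)
W_q = torch.nn.Linear(64, 64)
W_k = torch.nn.Linear(64, 64)
W_v = torch.nn.Linear(64, 64)
out = scaled_dot_product_attention(W_q(seq), W_k(seq), W_v(seq))
print(out.shape)  # torch.Size([2, 10, 64])
```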

## Supplementary Resources
