
sammmeeeer/History-of-Deep-Learning

 
 


Speedrun Implementation Of History Of Deep Learning 🐳

Note

... hey hi

... i'm attempting to implement around 60 important DL papers here (almost from scratch).

... i'm doing this to learn, so my notes and code might be wrong sometimes. so, be careful.

... inspired by adam-maj; i added a few more papers and a few sections.

... my ml resource stack : link

Important

... to deepen my understanding, i have built an image-captioning project with resnet + attention + lstms.

... you can find the project here


Contents

Deep Neural Networks

Optimization and Regularization

Sequence Modeling

Language Modeling

Image Generative Modeling

Deep Reinforcement Learning

Machine Learning

Papers

Deep Neural Networks

  • DNN - Learning Internal Representations by Error Propagation (1987), D. E. Rumelhart et al. [PDF]
  • CNN - Backpropagation Applied to Handwritten Zip Code Recognition (1989), Y. LeCun et al. [PDF]
  • LeNet - Gradient-Based Learning Applied to Document Recognition (1998), Y. LeCun et al. [PDF]
  • AlexNet - ImageNet Classification with Deep Convolutional Neural Networks (2012), A. Krizhevsky et al. [PDF]
  • U-Net - U-Net: Convolutional Networks for Biomedical Image Segmentation (2015), O. Ronneberger et al. [PDF]

Optimization and Regularization

  • Weight Decay - A Simple Weight Decay Can Improve Generalization (1991), A. Krogh and J. Hertz [PDF]
  • ReLU - Deep Sparse Rectifier Neural Networks (2011), X. Glorot et al. [PDF]
  • Residuals - Deep Residual Learning for Image Recognition (2015), K. He et al. [PDF]
  • Dropout - Dropout: A Simple Way to Prevent Neural Networks from Overfitting (2014), N. Srivastava et al. [PDF]
  • BatchNorm - Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015), S. Ioffe and C. Szegedy [PDF]
  • LayerNorm - Layer Normalization (2016), J. Lei Ba et al. [PDF]
  • GELU - Gaussian Error Linear Units (GELUs) (2016), D. Hendrycks and K. Gimpel [PDF]
  • Adam - Adam: A Method for Stochastic Optimization (2014), D. P. Kingma and J. Ba [PDF]

Sequence Modeling

  • RNN - A Learning Algorithm for Continually Running Fully Recurrent Neural Networks (1989), R. J. Williams and D. Zipser [PDF]
  • LSTM - Long Short-Term Memory (1997), S. Hochreiter and J. Schmidhuber [PDF]
  • Learning to Forget - Learning to Forget: Continual Prediction with LSTM (2000), F. A. Gers et al. [PDF]
  • Word2Vec - Efficient Estimation of Word Representations in Vector Space (2013), T. Mikolov et al. [PDF]
  • Phrase2Vec - Distributed Representations of Words and Phrases and their Compositionality (2013), T. Mikolov et al. [PDF]
  • Encoder-Decoder - Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (2014), K. Cho et al. [PDF]
  • Seq2Seq - Sequence to Sequence Learning with Neural Networks (2014), I. Sutskever et al. [PDF]
  • Attention - Neural Machine Translation by Jointly Learning to Align and Translate (2014), D. Bahdanau et al. [PDF]
  • Mixture of Experts - Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (2017), N. Shazeer et al. [PDF]

Language Modeling

  • Transformer - Attention Is All You Need (2017), A. Vaswani et al. [PDF]
  • BERT - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018), J. Devlin et al. [PDF]
  • RoBERTa - RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019), Y. Liu et al. [PDF]
  • T5 - Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (2019), C. Raffel et al. [PDF]
  • GPT - Improving Language Understanding by Generative Pre-Training (2018), A. Radford et al. [PDF]
  • GPT-4 - GPT-4 Technical Report (2023), OpenAI [PDF]
  • GPT-2 - Language Models are Unsupervised Multitask Learners (2019), A. Radford et al. [PDF]
  • GPT-3 - Language Models are Few-Shot Learners (2020), T. B. Brown et al. [PDF]
  • LoRA - LoRA: Low-Rank Adaptation of Large Language Models (2021), E. J. Hu et al. [PDF]
  • RLHF - Fine-Tuning Language Models From Human Preferences (2019), D. Ziegler et al. [PDF]
  • PPO - Proximal Policy Optimization Algorithms (2017), J. Schulman et al. [PDF]
  • InstructGPT - Training language models to follow instructions with human feedback (2022), L. Ouyang et al. [PDF]
  • Helpful & Harmless - Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022), Y. Bai et al. [PDF]
  • Vision Transformer - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020), A. Dosovitskiy et al. [PDF]
  • ELECTRA - ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (2020), K. Clark et al. [PDF]

Image Generative Modeling

  • GAN - Generative Adversarial Networks (2014), I. J. Goodfellow et al. [PDF]
  • VAE - Auto-Encoding Variational Bayes (2013), D. Kingma and M. Welling [PDF]
  • VQ VAE - Neural Discrete Representation Learning (2017), A. van den Oord et al. [PDF]
  • VQ VAE 2 - Generating Diverse High-Fidelity Images with VQ-VAE-2 (2019), A. Razavi et al. [PDF]
  • Diffusion - Deep Unsupervised Learning using Nonequilibrium Thermodynamics (2015), J. Sohl-Dickstein et al. [PDF]
  • Denoising Diffusion - Denoising Diffusion Probabilistic Models (2020), J. Ho et al. [PDF]
  • Denoising Diffusion 2 - Improved Denoising Diffusion Probabilistic Models (2021), A. Nichol and P. Dhariwal [PDF]
  • Diffusion Beats GANs - Diffusion Models Beat GANs on Image Synthesis (2021), P. Dhariwal and A. Nichol [PDF]
  • CLIP - Learning Transferable Visual Models From Natural Language Supervision (2021), A. Radford et al. [PDF]
  • DALL E - Zero-Shot Text-to-Image Generation (2021), A. Ramesh et al. [PDF]
  • DALL E 2 - Hierarchical Text-Conditional Image Generation with CLIP Latents (2022), A. Ramesh et al. [PDF]
  • SimCLR - A Simple Framework for Contrastive Learning of Visual Representations (2020), T. Chen et al. [PDF]

Deep Reinforcement Learning

  • Deep Reinforcement Learning - Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (2017), D. Silver et al. [PDF]
  • Deep Q-Learning - Playing Atari with Deep Reinforcement Learning (2013), V. Mnih et al. [PDF]
  • AlphaGo - Mastering the Game of Go with Deep Neural Networks and Tree Search (2016), D. Silver et al. [PDF]
  • AlphaFold - Highly accurate protein structure prediction with AlphaFold (2021), J. Jumper et al. [PDF]

Extra

  • Deep Learning - Deep Learning (2015), Y. LeCun, Y. Bengio, and G. Hinton [PDF]
  • DCGAN - Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (2016), A. Radford et al. [PDF]
  • BigGAN - Large Scale GAN Training for High Fidelity Natural Image Synthesis (2018), A. Brock et al. [PDF]
  • WaveNet - WaveNet: A Generative Model for Raw Audio (2016), A. van den Oord et al. [PDF]
  • BERTology - A Primer in BERTology: What We Know About How BERT Works (2020), A. Rogers et al. [PDF]
