- Here I am trying to implement around 60 important deep learning papers.
- Why am I doing this? Because I am still learning, so my notes and code may contain mistakes. Be careful.
- Inspired by [adam-maj] -> I added a few more papers and a few sections.
- Three stages of implementation: I will implement most of the models from scratch (but not all).
- My approach is to first gather all the resources and learn from them; I will also keep updating the repo.
- This is actually a personal learning repo.
- My ML resource stack: link

Concept | Complete |
---|---|
BackPropagation | ✅ |
CNN | ✅ |
AlexNet | ✅ |
U-net | ✅ |
vis-cnn | ✅ |
YOLO-v8 | ✅ |

Concept | Complete |
---|---|
weights-decay | ✅ |
relu | ✅ |
residuals | ✅ |
dropout | ✅ |
batch-norm | ✅ |
layer-norm | ✅ |
gelu | ✅ |
adam | ✅ |
early-stopping | ✅ |

Concept | Complete |
---|---|
rnn | |
lstm | |
GRU | |
learning-to-forget | |
word2vec | |
seq2seq | |
attention | |
mixture-of-experts | |

Concept | Complete |
---|---|
transformer | |
bert | |
t5 | |
gpt | |
lora | |
rlhf | |
vision-transformer | |

Concept | Complete |
---|---|
gans | |
vae | |
diffusion | |
clip | |
dall-e | |

Concept | Complete |
---|---|
Q-learning | |

Algorithm | Complete |
---|---|
Linear Regression | |
Logistic Regression | |
Decision Trees | |
Random Forest | |
Support Vector Machines | |
K-Nearest Neighbors | |
K-Means Clustering | |
Naive Bayes | |
PCA | |
Perceptron | |

- DNN - Learning Internal Representations by Error Propagation (1986), D. E. Rumelhart et al. [PDF]
- CNN - Backpropagation Applied to Handwritten Zip Code Recognition (1989), Y. LeCun et al. [PDF]
- LeNet - Gradient-Based Learning Applied to Document Recognition (1998), Y. LeCun et al. [PDF]
- AlexNet - ImageNet Classification with Deep Convolutional Neural Networks (2012), A. Krizhevsky et al. [PDF]
- U-Net - U-Net: Convolutional Networks for Biomedical Image Segmentation (2015), O. Ronneberger et al. [PDF]
- Weight Decay - A Simple Weight Decay Can Improve Generalization (1991), A. Krogh and J. Hertz [PDF]
- ReLU - Deep Sparse Rectifier Neural Networks (2011), X. Glorot et al. [PDF]
- Residuals - Deep Residual Learning for Image Recognition (2015), K. He et al. [PDF]
- Dropout - Dropout: A Simple Way to Prevent Neural Networks from Overfitting (2014), N. Srivastava et al. [PDF]
- BatchNorm - Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (2015), S. Ioffe and C. Szegedy [PDF]
- LayerNorm - Layer Normalization (2016), J. Lei Ba et al. [PDF]
- GELU - Gaussian Error Linear Units (GELUs) (2016), D. Hendrycks and K. Gimpel [PDF]
- Adam - Adam: A Method for Stochastic Optimization (2014), D. P. Kingma and J. Ba [PDF]
- RNN - A Learning Algorithm for Continually Running Fully Recurrent Neural Networks (1989), R. J. Williams and D. Zipser [PDF]
- LSTM - Long Short-Term Memory (1997), S. Hochreiter and J. Schmidhuber [PDF]
- Learning to Forget - Learning to Forget: Continual Prediction with LSTM (2000), F. A. Gers et al. [PDF]
- Word2Vec - Efficient Estimation of Word Representations in Vector Space (2013), T. Mikolov et al. [PDF]
- Phrase2Vec - Distributed Representations of Words and Phrases and their Compositionality (2013), T. Mikolov et al. [PDF]
- Encoder-Decoder - Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation (2014), K. Cho et al. [PDF]
- Seq2Seq - Sequence to Sequence Learning with Neural Networks (2014), I. Sutskever et al. [PDF]
- Attention - Neural Machine Translation by Jointly Learning to Align and Translate (2014), D. Bahdanau et al. [PDF]
- Mixture of Experts - Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (2017), N. Shazeer et al. [PDF]
- Transformer - Attention Is All You Need (2017), A. Vaswani et al. [PDF]
- BERT - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (2018), J. Devlin et al. [PDF]
- RoBERTa - RoBERTa: A Robustly Optimized BERT Pretraining Approach (2019), Y. Liu et al. [PDF]
- T5 - Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (2019), C. Raffel et al. [PDF]
- GPT-2 - Language Models are Unsupervised Multitask Learners (2019), A. Radford et al. [PDF]
- GPT-3 - Language Models are Few-Shot Learners (2020), T. B. Brown et al. [PDF]
- LoRA - LoRA: Low-Rank Adaptation of Large Language Models (2021), E. J. Hu et al. [PDF]
- RLHF - Fine-Tuning Language Models From Human Preferences (2019), D. Ziegler et al. [PDF]
- PPO - Proximal Policy Optimization Algorithms (2017), J. Schulman et al. [PDF]
- InstructGPT - Training language models to follow instructions with human feedback (2022), L. Ouyang et al. [PDF]
- Helpful & Harmless - Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback (2022), Y. Bai et al. [PDF]
- Vision Transformer - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (2020), A. Dosovitskiy et al. [PDF]
- GAN - Generative Adversarial Networks (2014), I. J. Goodfellow et al. [PDF]
- VAE - Auto-Encoding Variational Bayes (2013), D. Kingma and M. Welling [PDF]
- VQ VAE - Neural Discrete Representation Learning (2017), A. van den Oord et al. [PDF]
- VQ VAE 2 - Generating Diverse High-Fidelity Images with VQ-VAE-2 (2019), A. Razavi et al. [PDF]
- Diffusion - Deep Unsupervised Learning using Nonequilibrium Thermodynamics (2015), J. Sohl-Dickstein et al. [PDF]
- Denoising Diffusion - Denoising Diffusion Probabilistic Models (2020), J. Ho et al. [PDF]
- Denoising Diffusion 2 - Improved Denoising Diffusion Probabilistic Models (2021), A. Nichol and P. Dhariwal [PDF]
- Diffusion Beats GANs - Diffusion Models Beat GANs on Image Synthesis (2021), P. Dhariwal and A. Nichol [PDF]
- CLIP - Learning Transferable Visual Models From Natural Language Supervision (2021), A. Radford et al. [PDF]
- DALL-E - Zero-Shot Text-to-Image Generation (2021), A. Ramesh et al. [PDF]
- DALL-E 2 - Hierarchical Text-Conditional Image Generation with CLIP Latents (2022), A. Ramesh et al. [PDF]
- Deep Learning - Deep Learning (2015), Y. LeCun, Y. Bengio, and G. Hinton [PDF]
- DCGAN - Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (2016), A. Radford et al. [PDF]
- BigGAN - Large Scale GAN Training for High Fidelity Natural Image Synthesis (2018), A. Brock et al. [PDF]
- WaveNet - WaveNet: A Generative Model for Raw Audio (2016), A. van den Oord et al. [PDF]
- BERTology - A Primer in BERTology: What We Know About How BERT Works (2020), A. Rogers et al. [PDF]
- GPT - Improving Language Understanding by Generative Pre-Training (2018), A. Radford et al. [PDF]
- GPT-4 - GPT-4 Technical Report (2023), OpenAI [PDF]
- Deep Reinforcement Learning - Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm (2017), D. Silver et al. [PDF]
- Deep Q-Learning - Playing Atari with Deep Reinforcement Learning (2013), V. Mnih et al. [PDF]
- AlphaGo - Mastering the Game of Go with Deep Neural Networks and Tree Search (2016), D. Silver et al. [PDF]
- AlphaFold - Highly accurate protein structure prediction with AlphaFold (2021), J. Jumper et al. [PDF]
- ELECTRA - ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (2020), K. Clark et al. [PDF]
- SimCLR - A Simple Framework for Contrastive Learning of Visual Representations (2020), T. Chen et al. [PDF]