This repository contains my solutions to the assignments of the Stanford CS224N: Natural Language Processing with Deep Learning course from winter 2022/23. There are many other great repositories on this course, but none that cover the latest assignments (winter 2022/2023) and include both the written and practical parts in full (as of May 12, 2023). This repository is intended as a learning resource that provides answers if you are stuck. Please do yourself a favor and try the assignments on your own first. If you come across any errors or would like me to include a more detailed explanation, please let me know at [email protected].
Reading papers is an important part of this course and crucial for completing the assignments successfully. Therefore, I recommend having a look at How to Read a Paper.
Apr.18.2023
- watch Lecture 1 and Lecture 2
- read Efficient Estimation of Word Representations in Vector Space and Distributed Representations of Words and Phrases and their Compositionality
- finish assignment 1
- go through Python Review Session (slides)
Apr.19.2023
- read GloVe: Global Vectors for Word Representation, Improving Distributional Similarity with Lessons Learned from Word Embeddings and Evaluation methods for unsupervised word embeddings
- watch Lecture 3 and Lecture 4
Apr.20.2023
- read matrix calculus notes; Review of differential calculus; CS231n notes on network architectures; CS231n notes on backprop; Derivatives, Backpropagation, and Vectorization; Learning Representations by Backpropagating Errors
Apr.21.2023
- read Understanding word vectors (my own suggestion, not part of the original CS224N material)
- finish assignment 2 written
Apr.22.2023
- read the "additional readings": A Latent Variable Model Approach to PMI-based Word Embeddings; Linear Algebraic Structure of Word Senses, with Applications to Polysemy; On the Dimensionality of Word Embedding; Yes you should understand backprop; Natural Language Processing (Almost) from Scratch
Apr.23.2023
- finish assignment 2
- watch Lecture 5 and Lecture 6
Apr.24.2023
- complete PyTorch Tutorial
- read N-gram Language Models (textbook chapter), The Unreasonable Effectiveness of Recurrent Neural Networks (blog post overview), Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.1 and 10.2), On Chomsky and the Two Cultures of Statistical Learning, Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.3, 10.5, 10.7-10.12), Learning long-term dependencies with gradient descent is difficult (one of the original vanishing gradient papers), On the difficulty of training Recurrent Neural Networks (proof of vanishing gradient problem), Vanishing Gradients Jupyter Notebook (demo for feedforward networks), Understanding LSTM Networks (blog post overview)
Apr.25.2023
- read the papers from assignment 3: Adam: A Method for Stochastic Optimization, Tricks from the actual Adam update, Dropout: A Simple Way to Prevent Neural Networks from Overfitting
Apr.26.2023
- finish assignment 3 written
Apr.27.2023
- read An Explanation of Xavier Initialization
- read PyTorch documentation: nn.Parameters, Initialization, Dropout, Index select, Gather, View, Flatten, Matrix product, ReLU, Adam Optimizer, Cross Entropy Loss, Optimizer Step
- finish assignment 3
Apr.28.2023
- watch Lecture 7 and Lecture 8
- read Statistical Machine Translation slides, CS224n 2015 (lectures 2/3/4), Statistical Machine Translation (book by Philipp Koehn), BLEU (original paper), Sequence to Sequence Learning with Neural Networks (original seq2seq NMT paper), Sequence Transduction with Recurrent Neural Networks (early seq2seq speech recognition paper), Neural Machine Translation by Jointly Learning to Align and Translate (original seq2seq+attention paper), Attention and Augmented Recurrent Neural Networks (blog post overview), Massive Exploration of Neural Machine Translation Architectures (practical advice for hyperparameter choices), Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models, Revisiting Character-Based Neural Machine Translation with Capacity and Compression
May.6.2023
- read Embedding Layer, LSTM, LSTM Cell, Linear Layer, Dropout Layer, Conv1D Layer
- finish assignment 4
May.7.2023
- finish assignment 4 written
May.8.2023
- watch Lecture 9
- read Attention Is All You Need, The Illustrated Transformer, Transformer (Google AI blog post), Layer Normalization, Image Transformer, Music Transformer: Generating music with long-term structure
May.9.2023
- watch Lecture 10 and Lecture 11
- read BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding; Contextual Word Representations: A Contextual Introduction; The Illustrated BERT, ELMo, and co.; Martin & Jurafsky Chapter on Transfer Learning
May.10.2023
- watch Lecture 12
- read The Curious Case of Neural Text Degeneration, Get To The Point: Summarization with Pointer-Generator Networks, Hierarchical Neural Story Generation, How NOT To Evaluate Your Dialogue System
May.11.2023
- finish assignment 5
May.12.2023
- finish assignment 5 written