Welcome to the NLP repository! This repository contains Jupyter notebooks that implement fundamental Natural Language Processing (NLP) functions and models from scratch using numpy. It's designed to help you understand the core building blocks of neural networks and NLP techniques. The notebooks are as follows:
- micrograd.ipynb: Implements the basic building blocks of neural networks, including linear layers (matrix multiplication), non-linearities, and backpropagation, with the computation graph visualized using graphviz. A minimal autograd sketch in this spirit appears after this list.
- mlp_bigram.ipynb: Implements a bigram model trained on a dataset of English names, adding features such as Batch Normalization and dropout on top of the micrograd framework. A batch-normalization sketch appears after this list.
- shakespeare_transformer.ipynb: Implements a transformer model trained on the concatenated works of Shakespeare, including classes for Multi-Head Attention, position embeddings, and other components, built on top of the previous notebooks. A scaled dot-product attention sketch appears after this list.
- spiral.ipynb: Trains a classifier on a synthetic spiral dataset, covering data preprocessing, loss computation, and gradient-descent optimization, and visualizes the decision boundary of the trained model. A training-loop sketch appears after this list.
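To give a feel for what micrograd.ipynb builds, here is a minimal sketch of a scalar autograd engine in that spirit. The `Value` class and its API are illustrative assumptions, not the notebook's exact code:

```python
# Minimal sketch of a scalar autograd engine in the spirit of micrograd.ipynb.
# The class name `Value` and its API are illustrative, not the notebook's code.
import math

class Value:
    """A scalar that records the operations producing it, for backpropagation."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad        # d(a+b)/da = 1
            other.grad += out.grad       # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            self.grad += (1 - t ** 2) * out.grad  # d tanh(x)/dx = 1 - tanh(x)^2
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Usage: a single neuron, y = tanh(w*x + b), then backpropagate.
x, w, b = Value(2.0), Value(-0.5), Value(0.1)
y = (w * x + b).tanh()
y.backward()
print(x.grad, w.grad, b.grad)  # gradients of y w.r.t. each input
```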
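The Batch Normalization step that mlp_bigram.ipynb mentions can be sketched in a few lines of numpy: normalize each feature over the batch, then apply a learnable scale and shift. The function and variable names below are illustrative:

```python
# A hedged numpy sketch of a batch-normalization forward pass; names are
# illustrative and not taken from mlp_bigram.ipynb itself.
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift."""
    mean = x.mean(axis=0, keepdims=True)        # per-feature batch mean
    var = x.var(axis=0, keepdims=True)          # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)     # zero mean, unit variance
    return gamma * x_hat + beta                 # learnable scale and shift

rng = np.random.default_rng(0)
h = rng.normal(size=(32, 64))                   # a batch of 32 hidden activations
gamma, beta = np.ones((1, 64)), np.zeros((1, 64))
out = batchnorm_forward(h, gamma, beta)
print(out.mean(axis=0)[:3], out.std(axis=0)[:3])  # ~0 and ~1 per feature
```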
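At the heart of the transformer's Multi-Head Attention is scaled dot-product attention, where a single head computes softmax(QK^T / sqrt(d_k)) V. Here is a hedged numpy sketch for one head, with illustrative shapes and a causal mask; it is not the notebook's exact implementation:

```python
# Scaled dot-product attention for a single head, sketched in numpy.
# Shapes and names are illustrative assumptions.
import numpy as np

def attention(q, k, v, mask=None):
    """softmax(QK^T / sqrt(d_k)) V, with an optional causal mask."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)   # (T, T) token affinities
    if mask is not None:
        scores = np.where(mask, scores, -1e9)        # block future positions
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # rows sum to 1
    return weights @ v

T, d = 8, 16                                          # sequence length, head size
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(T, d)) for _ in range(3))
causal = np.tril(np.ones((T, T), dtype=bool))         # token t attends to tokens <= t
out = attention(q, k, v, mask=causal)
print(out.shape)                                      # (8, 16)
```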
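The spiral.ipynb workflow (synthesize spiral data, compute a loss, run gradient descent) can be sketched with a plain softmax classifier in numpy. The hyperparameters and the choice of a linear model here are simplifying assumptions:

```python
# A hedged sketch of the spiral.ipynb workflow: synthesize spiral data, then
# fit a linear softmax classifier by gradient descent. Hyperparameters are
# illustrative, not the notebook's.
import numpy as np

rng = np.random.default_rng(0)
N, K, D = 100, 3, 2                        # points per class, classes, dimensions
X = np.zeros((N * K, D))
y = np.zeros(N * K, dtype=int)
for k in range(K):                         # one interleaved spiral arm per class
    ix = range(N * k, N * (k + 1))
    r = np.linspace(0.0, 1.0, N)           # radius
    t = np.linspace(k * 4, (k + 1) * 4, N) + rng.normal(scale=0.2, size=N)
    X[ix] = np.column_stack([r * np.sin(t), r * np.cos(t)])
    y[ix] = k

W = 0.01 * rng.normal(size=(D, K))         # linear softmax classifier
b = np.zeros(K)
lr, reg = 1.0, 1e-3
for step in range(200):
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)          # stable softmax
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    loss = -np.log(probs[np.arange(len(y)), y]).mean() + 0.5 * reg * (W * W).sum()
    dlogits = probs
    dlogits[np.arange(len(y)), y] -= 1                   # cross-entropy gradient
    dlogits /= len(y)
    dW = X.T @ dlogits + reg * W
    db = dlogits.sum(axis=0)
    W -= lr * dW                                         # vanilla gradient descent
    b -= lr * db
print(f"final loss: {loss:.3f}")
```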
Using this repository is straightforward: read through the code and run the cells in each Jupyter notebook. The notebooks are designed to be self-explanatory, so you can explore the code, experiment with the models, and gain a deeper understanding of NLP concepts.
Feel free to contribute, experiment, or adapt the code for your own projects. If you have any questions or encounter issues, please don't hesitate to reach out.
This project is licensed under the MIT License.