Adapted from the course content of DeepLearning.AI's Deep Learning Specialization, this repository contains implementations of several common deep neural network (DNN) architectures. The goal is to provide a comprehensive collection of well-documented and easy-to-use models for educational and research purposes. Each model includes detailed explanations and code examples to facilitate understanding and practical use.
Directory | Description |
---|---|
1. Logistic Regression | Build a logistic regression classifier from scratch to recognize cats, incorporating forward and backward propagation, a cost function, and gradient descent (a minimal sketch follows the table). |
2. Classification with One Hidden Layer | Demonstrate the construction of a neural network with a single hidden layer for 2-class classification. The objectives include implementing non-linear activation functions like tanh, computing cross-entropy loss, and executing forward and backward propagation, highlighting the differences from a basic logistic regression model. |
3. Building Deep Neural Network | Construct a deep neural network from scratch for image classification. The objectives include implementing non-linear units like ReLU, building a network with multiple hidden layers, and developing an easy-to-use neural network class. |
4. Deep Neural Network for Image Classification | Build and train a deep L-layer neural network for a cat/not-a-cat classifier, leveraging functions from previous work to improve accuracy over earlier logistic regression models. The network is applied to supervised learning tasks. |
5. Initialization | Explore various methods for initializing neural network weights, including zeros, random, and He initialization, and demonstrate how a well-chosen initialization can accelerate convergence and improve training and generalization (He initialization is sketched below the table). |
6. Gradient Checking | Demonstrate gradient checking, comparing analytical gradients with numerical estimates to verify a backpropagation implementation and to flag potential bugs (a minimal sketch follows the table). |
7. Regularization | Implement regularization techniques to improve the generalization of deep learning models. These models, due to their high flexibility and capacity, are prone to overfitting, particularly with limited training data, which can lead to poor performance on new, unseen examples. |
8. Optimization Methods | Explore optimization methods beyond batch gradient descent, including gradient descent with momentum, RMSProp, and Adam, to accelerate learning and reach lower values of the cost function, and demonstrate how these techniques, together with random mini-batches, can significantly reduce training time and improve model performance (the Adam update is sketched below the table). |
9. Convolution Model | Demonstrate how to implement convolutional and pooling layers using numpy, covering forward and (optionally) backward propagation. Objectives include explaining the convolution operation, applying various pooling methods, identifying key components of convolutional neural networks (padding, stride, and filters), and ultimately building a convolutional neural network (a naive forward pass is sketched below the table). |
10. Convolution Model/ Mood Classifier | Create a mood classifier with TensorFlow's Keras Sequential API and a ConvNet for sign language digit recognition using the Keras Functional API. Objectives include building and training ConvNets for binary and multiclass classification problems and explaining the different use cases for the Sequential and Functional APIs. |
11. Transfer Learning with MobileNet | Explore transfer learning by adapting a pre-trained MobileNetV2 model to build a binary Alpaca classifier. MobileNetV2, known for its efficiency and trained on the ImageNet dataset, serves as the foundation for this task. |
12. Residual Networks | Construct a deep convolutional network using Residual Networks (ResNets) to tackle the challenges of training very deep networks. The objectives include implementing ResNet building blocks in Keras, combining them to develop a state-of-the-art image classification network, and integrating skip connections to enhance performance. |
13. Autonomous driving application Car detection | Implement object detection using the YOLO model, a powerful tool for real-time object detection. The objectives include detecting objects in a car detection dataset, applying non-max suppression to enhance accuracy, calculating intersection over union (sketched below the table), and managing bounding boxes for image annotation. |
14. Image Segmentation Unet | Build a U-Net, a Convolutional Neural Network (CNN) specialized for accurate and fast image segmentation. The U-Net will be employed to perform semantic segmentation on a self-driving car dataset, predicting a label for each pixel in an image. |
15. Face Recognition | Build a face recognition system. Many of the ideas presented here are from FaceNet (Schroff et al., 2015). |
16. Art Generation with Neural Style Transfer | Explore Neural Style Transfer, an algorithm introduced by Gatys et al. (2015), which transforms images to blend artistic styles with content. The objectives include implementing the neural style transfer algorithm, generating unique artistic images, and defining both the style and content cost functions for this technique. |
17. Recurrent Neural Network and LSTM | Build a Recurrent Neural Network (RNN) using NumPy. Key objectives include defining the notation for sequence models, describing the architecture of a basic RNN, identifying components of an LSTM, and implementing backpropagation through time for both RNNs and LSTMs. Additionally, various types of RNNs are illustrated through examples. |
18. Language Model | Build a character-level language model to generate dinosaur names and articles in the style of Shakespeare, using patterns from a dataset. Objectives include using an RNN for text processing and generation, sampling sequences, and addressing gradient issues with gradient clipping. |
19. Improvise a Jazz Solo with an LSTM Network | Implement an LSTM-based model for generating jazz music, allowing users to listen to the created compositions. The objectives are to apply LSTM for music generation, create jazz pieces with deep learning, and use the Functional API for building complex models. |
20. Word Vectors | Explore the use of pre-trained word embeddings for analysis and modification. This guide demonstrates how embeddings capture word relationships, including steps for loading pre-trained vectors, measuring similarity with cosine similarity, and solving word analogy problems such as "Man is to Woman as King is to ______" (sketched below the table). The notebook also tackles reducing gender bias in embeddings, an important fairness concern in machine learning. |
21. Emojify Word Embedding | Utilize word vector representations to create an Emojifier, which automatically generates emojis for a given sentence. For instance, it can change "Congratulations on the promotion! Let's get coffee and talk. Love you!" into "Congratulations on the promotion! 👍 Let's get coffee and talk. ☕️ Love you! ❤️" |
22. Translation with attention | Build a Neural Machine Translation (NMT) model to convert human-readable dates (e.g., "25th of June, 2009") into machine-readable formats (e.g., "2009-06-25"). The goal is to leverage an attention model, a refined sequence-to-sequence approach, to facilitate this translation. |
23. Trigger Word Detection | Apply deep learning to speech recognition by developing a trigger word detection system. The goal is to create a speech dataset and implement an algorithm to detect a specific word—"activate"—which will prompt a "chiming" sound when recognized. |
24. Transformer | Explore the Transformer architecture, which speeds up training and excels in handling sequential data for NLP. Introduced in the 2017 paper "Attention Is All You Need," Transformers are essential for modern NLP models. Objectives include developing positional encodings, computing self-attention, implementing masked multi-head attention, and training a Transformer model. |
25. Embedding & Positional Encoding | Examine pre-processing methods applied to raw text before it enters the encoder and decoder blocks of the Transformer architecture. Objectives include creating visualizations to understand positional encodings and demonstrating their impact on word embeddings (a sinusoidal positional encoding is sketched below the table). |
26. Transformer Application NER | Explore using the previously built Transformer architecture for Named-Entity Recognition (NER). Objectives include employing tokenizers and pre-trained models from the HuggingFace Library and fine-tuning a Transformer model specifically for NER tasks. |
27. Transformer Application QA | Explore applying the previously built Transformer architecture to Question Answering (QA), a core function of Large Language Models like GPT-4. Objectives include performing extractive QA, fine-tuning a pre-trained Transformer model on a custom dataset, and implementing the QA model in both TensorFlow and PyTorch. |
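
The Logistic Regression entry revolves around forward propagation, a cross-entropy cost, and gradient descent. Below is a minimal NumPy sketch of that training loop, assuming feature-column data `X` of shape `(n_features, m)` and labels `Y` of shape `(1, m)`; the function name and hyperparameter defaults are illustrative, not the repository's exact code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, Y, num_iterations=1000, learning_rate=0.01):
    """X: (n_features, m) data, Y: (1, m) labels in {0, 1}."""
    n, m = X.shape
    w = np.zeros((n, 1))
    b = 0.0
    for _ in range(num_iterations):
        # Forward propagation: predicted probabilities and cross-entropy cost
        A = sigmoid(w.T @ X + b)                       # shape (1, m)
        cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
        # Backward propagation: gradients of the cost w.r.t. w and b
        dZ = A - Y
        dw = (X @ dZ.T) / m
        db = np.sum(dZ) / m
        # Gradient descent update
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b, cost
```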
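
The Initialization entry highlights He initialization, which scales each weight matrix by `sqrt(2 / fan_in)` so activations neither vanish nor explode in deep ReLU networks. A minimal sketch is below; the helper name and the `layer_dims` convention are assumptions for illustration.

```python
import numpy as np

def initialize_he(layer_dims, seed=0):
    """He initialization: weights scaled by sqrt(2 / fan_in), biases zero.
    layer_dims lists layer sizes, e.g. [n_x, n_h1, n_h2, n_y]."""
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        scale = np.sqrt(2.0 / layer_dims[l - 1])   # He scaling factor
        params[f"W{l}"] = rng.standard_normal((layer_dims[l], layer_dims[l - 1])) * scale
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

# Example: a 3-layer network for 64x64x3 images
params = initialize_he([12288, 20, 7, 1])
```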
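
The Gradient Checking entry compares analytical gradients with two-sided numerical estimates. The sketch below returns the relative difference between the two; values around 1e-7 or smaller usually indicate a correct backpropagation implementation. The signature (`f`, `grad_f`, a flat parameter vector `theta`) is an assumption, not the notebook's exact interface.

```python
import numpy as np

def gradient_check(f, grad_f, theta, epsilon=1e-7):
    """f: cost as a function of the 1-D parameter vector theta.
    grad_f: function returning the analytical gradient at theta."""
    grad_approx = np.zeros_like(theta, dtype=float)
    for i in range(theta.size):
        theta_plus, theta_minus = theta.copy(), theta.copy()
        theta_plus[i] += epsilon
        theta_minus[i] -= epsilon
        # Two-sided numerical estimate of the i-th partial derivative
        grad_approx[i] = (f(theta_plus) - f(theta_minus)) / (2 * epsilon)
    grad = grad_f(theta)
    # Relative difference between analytical and numerical gradients
    numerator = np.linalg.norm(grad - grad_approx)
    denominator = np.linalg.norm(grad) + np.linalg.norm(grad_approx)
    return numerator / denominator

# Example with a simple quadratic cost
diff = gradient_check(lambda t: np.sum(t ** 2), lambda t: 2 * t, np.array([1.0, -2.0, 3.0]))
```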
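
The Optimization Methods entry culminates in Adam, which combines Momentum-style first-moment and RMSProp-style second-moment estimates with bias correction. The single-step sketch below uses the common hyperparameter defaults; the names and defaults are assumptions for illustration.

```python
import numpy as np

def adam_update(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step for a single parameter array; m and v are the running
    first and second moment estimates, t is the 1-based iteration count."""
    m = beta1 * m + (1 - beta1) * grad               # momentum-style average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2          # RMSProp-style average of squared gradients
    m_hat = m / (1 - beta1 ** t)                     # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```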
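
The Convolution Model entry implements convolution directly in NumPy. The naive forward pass below illustrates padding, stride, and per-filter slicing; the `(m, n_H, n_W, n_C)` tensor layout and function names are assumptions chosen for clarity, not the repository's exact implementation.

```python
import numpy as np

def conv_single_step(a_slice, W, b):
    """Apply one filter to one slice of the input: elementwise product, sum, plus bias."""
    return np.sum(a_slice * W) + float(b)

def conv_forward(A_prev, W, b, stride=1, pad=0):
    """Naive forward pass of a conv layer.
    A_prev: (m, n_H_prev, n_W_prev, n_C_prev), W: (f, f, n_C_prev, n_C), b: (1, 1, 1, n_C)."""
    m, n_H_prev, n_W_prev, n_C_prev = A_prev.shape
    f, _, _, n_C = W.shape
    n_H = (n_H_prev + 2 * pad - f) // stride + 1
    n_W = (n_W_prev + 2 * pad - f) // stride + 1
    A_pad = np.pad(A_prev, ((0, 0), (pad, pad), (pad, pad), (0, 0)))  # zero padding on H and W
    Z = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):                      # loop over examples
        for h in range(n_H):                # loop over output rows
            for w_idx in range(n_W):        # loop over output columns
                for c in range(n_C):        # loop over filters
                    vs, hs = h * stride, w_idx * stride
                    a_slice = A_pad[i, vs:vs + f, hs:hs + f, :]
                    Z[i, h, w_idx, c] = conv_single_step(a_slice, W[:, :, :, c], b[0, 0, 0, c])
    return Z
```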
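
The car detection entry computes intersection over union (IoU), the overlap measure that non-max suppression uses to discard duplicate boxes. A small sketch, assuming corner-coordinate boxes `(x1, y1, x2, y2)`, is below.

```python
def intersection_over_union(box1, box2):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    # Coordinates of the intersection rectangle
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter_area = max(xi2 - xi1, 0) * max(yi2 - yi1, 0)
    # Union = sum of the two areas minus the intersection
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter_area / (area1 + area2 - inter_area)
```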
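
The Word Vectors entry measures similarity with cosine similarity and uses it to solve analogies. The sketch below, assuming a `word_to_vec` dictionary mapping words to NumPy vectors, shows both pieces; it is an illustration, not the notebook's exact code.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def complete_analogy(word_a, word_b, word_c, word_to_vec):
    """Find word_d such that word_a : word_b :: word_c : word_d,
    i.e. the word whose offset from word_c best matches e_b - e_a."""
    e_a, e_b, e_c = (word_to_vec[w] for w in (word_a, word_b, word_c))
    diff = e_b - e_a
    best_word, best_sim = None, -np.inf
    for w, e_w in word_to_vec.items():
        if w in (word_a, word_b, word_c):
            continue
        sim = cosine_similarity(diff, e_w - e_c)
        if sim > best_sim:
            best_sim, best_word = sim, w
    return best_word
```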
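
The Transformer and Embedding & Positional Encoding entries rely on the sinusoidal positional encodings from "Attention Is All You Need": even dimensions use sine, odd dimensions use cosine, with geometrically spaced frequencies. The NumPy sketch below generates the encoding matrix; the function name is an assumption for illustration.

```python
import numpy as np

def positional_encoding(max_positions, d_model):
    """Return a (max_positions, d_model) matrix of sinusoidal positional encodings."""
    positions = np.arange(max_positions)[:, np.newaxis]   # (max_positions, 1)
    dims = np.arange(d_model)[np.newaxis, :]              # (1, d_model)
    # Frequencies decrease geometrically with the (paired) dimension index
    angle_rates = 1.0 / np.power(10000, (2 * (dims // 2)) / np.float32(d_model))
    angles = positions * angle_rates
    pe = np.zeros((max_positions, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])                 # even dimensions: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])                 # odd dimensions: cosine
    return pe
```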