MLOps Repository

Overview

Welcome to the MLOps Repository! This repository shares reading materials, labs, and exercises for the MLOps (Machine Learning Operations) course at Northeastern University. Its primary goal is to provide a centralized platform where students, instructors, and anyone interested in MLOps can access and collaborate on course-related materials. You can learn more about machine learning topics by watching my videos on YouTube or visiting my website.

Introduction

MLOps is an emerging discipline that focuses on collaboration and communication between data scientists and IT professionals while automating and streamlining the machine learning lifecycle. It bridges the gap between machine learning development and production deployment, so that machine learning models are scalable, reproducible, and maintainable. This repository serves as a resource hub for students and instructors of Northeastern University's MLOps course.

Course Description

The MLOps course at Northeastern University is designed to provide students with a comprehensive understanding of the MLOps field. Throughout the course, students will learn how to:

  • Build end-to-end machine learning pipelines
  • Deploy machine learning models to production
  • Monitor and maintain ML systems
  • Implement CI/CD/CM/CT (Continuous Integration/Continuous Deployment/Continuous Monitoring/Continuous Training) for ML
  • Containerize and orchestrate ML workloads
  • Handle data drift and model retraining

This repository hosts the labs, code samples, and documentation related to these topics.
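
As an illustration of the data drift topic listed above, here is a minimal sketch of one common drift check, the Population Stability Index (PSI), which compares the binned distribution of a feature at training time against the same feature in production. It is not taken from the course labs: the function name `population_stability_index`, the equal-width binning, and the synthetic Gaussian samples are all assumptions made for illustration.

```python
import math
import random


def population_stability_index(reference, current, bins=10):
    """PSI between two numeric samples, binned on the reference sample's range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic data (an assumption): one feature at training time and in production.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(train, same)
psi_shifted = population_stability_index(train, shifted)
print(f"no drift: {psi_same:.4f}, shifted mean: {psi_shifted:.4f}")
```

A frequently quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift that may justify retraining. These thresholds are conventions rather than guarantees, so they should be tuned per feature.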

Labs Content

This repository offers a series of hands-on labs designed to enhance your understanding of MLOps concepts. Each lab focuses on a specific aspect of the machine learning lifecycle, providing practical experience with tools and methodologies essential for deploying and managing machine learning models in production environments.

  1. API Labs

  2. Airflow Labs

    • Objective: Gain practical experience with Apache Airflow for orchestrating complex data workflows.
    • Sub-Labs:
      • Lab 1: Basic Airflow setup and DAGs.
      • Lab 2: Advanced DAG dependencies and scheduling.
      • assets: Contains additional assets for Airflow labs.
  3. CloudFunction Labs

  4. Data Labs

    • Objective: Understand data engineering and preprocessing steps.
    • Sub-Labs:
  5. Data Storage & Warehouse Labs

    • Objective: Explore data storage solutions and data warehousing.
    • Sub-Labs:
      • Lab 1: Introduction to data warehousing.
      • Lab 2: Advanced data storage techniques.
      • Lab 3: Optimization and data retrieval practices.
  6. Docker Container Labs

  7. ELK Labs

  8. Experiment Tracking Labs

    • Objective: Track and manage ML experiments.
    • Sub-Labs:
  9. GCP Labs

  10. GitHub Labs

  11. Kubeflow Labs

  12. MLMD Labs

    • Objective: Understand ML Metadata (MLMD) for tracking metadata.
    • Sub-Labs:
      • Lab 1: Introduction to ML metadata concepts.
      • Lab 2: Advanced usage and querying of ML metadata.
      • assets: Supporting materials and assets for MLMD labs.
  13. TensorFlow Labs

    • Objective: Gain hands-on experience with TensorFlow for ML model development.
    • Sub-Labs:

Each lab is accompanied by detailed instructions and code examples to facilitate hands-on learning. It's recommended to follow the labs sequentially, as concepts build upon each other. For additional resources and support, refer to the Reading Materials section of this repository.
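
To make the idea of DAG dependencies from the Airflow labs concrete, the sketch below shows how a set of task dependencies determines a valid execution order. It deliberately uses only Python's standard `graphlib` module rather than Airflow itself, so it runs without any installation. The task names and the `execution_order` helper are hypothetical and do not come from the labs. In Airflow, the same relationships would be declared between operators, for example with the `>>` operator.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "validate": {"extract"},
    "train": {"validate"},
    "evaluate": {"train"},
    "report": {"validate", "evaluate"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

If the dependencies contain a cycle, `TopologicalSorter` raises `graphlib.CycleError`, which mirrors the requirement that a workflow graph be acyclic.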

Getting Started

To get started with the labs and exercises in this repository, please follow these steps:

  1. Clone this repository to your local machine.
  2. Navigate to the specific lab you are interested in.
  3. Read the lab instructions and review any accompanying documentation.
  4. Follow the provided code samples and examples to complete the lab exercises.
  5. Feel free to explore, modify, and experiment with the code to deepen your understanding.

For more detailed information on each lab and prerequisites, please refer to the lab's README or documentation.

Contributing

Contributions to this repository are welcome! If you are a student or instructor and would like to contribute your own labs, improvements, or corrections, please follow these guidelines:

  1. Fork this repository.
  2. Create a branch for your changes.
  3. Make your changes and commit them with clear, concise messages.
  4. Test your changes to ensure they work as expected.
  5. Submit a pull request to the main repository.

Your contributions will help improve the overall quality of the labs and benefit the entire MLOps community.

Reference:

The reading materials in this repository were collected from Coursera under the Creative Commons License.

License

This repository is open-source and is distributed under the Creative Commons License. Please review the license for more details on how you can use and share the content within this repository.
