
🤖 Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: ~ source code + 11 hands-on lessons

MekongDelta-mind/llm-twin-course

LLM Twin Course: Building Your Production-Ready AI Replica

Learn to architect and implement a production-ready LLM & RAG system by building your LLM Twin

From data gathering to productionizing LLMs using LLMOps good practices.

by Decoding ML


Why is this course different?

By finishing the "LLM Twin: Building Your Production-Ready AI Replica" free course, you will learn how to design, train, and deploy a production-ready LLM twin of yourself powered by LLMs, vector DBs, and LLMOps good practices.

Why should you care? 🫵

→ No more isolated scripts or Notebooks! Learn production ML by building and deploying an end-to-end production-grade LLM system.

What will you learn to build by the end of this course?

You will learn how to architect and build a real-world LLM system from start to finish, from data collection to deployment.

You will also learn to leverage MLOps best practices, such as experiment trackers, model registries, prompt monitoring, and versioning.

The end goal? Build and deploy your own LLM twin.

What is an LLM Twin? It is an AI character that learns to write like a specific person by incorporating that person's style and personality into an LLM.


The architecture of the LLM twin is split into 4 Python microservices:

LLM Twin Architecture

The data collection pipeline

  • Crawl your digital data from various social media platforms.
  • Clean, normalize and load the data to a Mongo NoSQL DB through a series of ETL pipelines.
  • Send database changes to a RabbitMQ queue using the CDC pattern.
  • ☁️ Deployed on AWS.
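As a rough illustration of the CDC step above, the sketch below turns every write into a JSON change event and pushes it onto a queue. It is a minimal stand-in, not the course's implementation: `ChangeEvent`, `InMemoryCollection`, and the event schema are hypothetical names, a standard-library `queue.Queue` plays the role of RabbitMQ, and a write hook replaces the database change log that a real CDC setup would read.

```python
import json
import queue
from dataclasses import asdict, dataclass


@dataclass
class ChangeEvent:
    """One database change (hypothetical schema, for illustration only)."""
    operation: str   # "insert", "update", or "delete"
    collection: str  # name of the collection that changed
    document: dict   # the document that was written


class InMemoryCollection:
    """Stand-in for a database collection that reports every write."""

    def __init__(self, name: str, changes: queue.Queue) -> None:
        self.name = name
        self.documents: list[dict] = []
        self._changes = changes

    def insert(self, document: dict) -> None:
        self.documents.append(document)
        event = ChangeEvent("insert", self.name, document)
        # Serialize to JSON so any downstream consumer can parse the message.
        self._changes.put(json.dumps(asdict(event)))


changes: queue.Queue = queue.Queue()  # stand-in for the RabbitMQ queue
posts = InMemoryCollection("posts", changes)
posts.insert({"author": "me", "text": "Hello, LLM twin!"})

# A consumer (the feature pipeline, in the course) would read the event here.
message = json.loads(changes.get_nowait())
print(message["operation"], message["collection"])
```

The point of the pattern is that the consumer never polls the database; it only reacts to the events it receives.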

The feature pipeline

  • Consume messages from a queue through a Bytewax streaming pipeline.
  • Every message will be cleaned, chunked, embedded and loaded into a Qdrant vector DB in real-time.
  • In the bonus series, we refactor the cleaning, chunking, and embedding logic using Superlinked, a specialized vector compute engine. We will also load and index the vectors to Redis vector search.
  • ☁️ Deployed on AWS.
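The clean, chunk, and embed steps above can be pictured with the following standard-library sketch. It assumes nothing about the actual Bytewax or Qdrant code in the course: the function names are hypothetical, chunking is a simple fixed-size word window, and `toy_embed` is a deterministic placeholder rather than a real embedding model.

```python
import hashlib
import re


def clean(text: str) -> str:
    """Collapse runs of whitespace and strip both ends."""
    return re.sub(r"\s+", " ", text).strip()


def chunk(text: str, max_words: int = 5) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Deterministic placeholder vector in [0, 1]; NOT a real embedding model."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dim]]


def process(message: str) -> list[dict]:
    """Run one queue message through clean -> chunk -> embed."""
    return [{"text": piece, "vector": toy_embed(piece)} for piece in chunk(clean(message))]


records = process("  Crawl   your digital data,\n then clean, chunk and embed it.  ")
for record in records:
    print(record["text"], len(record["vector"]))
```

Each resulting record pairs a text chunk with its vector, which is the shape a vector DB typically stores.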

The training pipeline

  • Create a custom dataset based on your digital data.
  • Fine-tune an LLM using QLoRA.
  • Use Comet ML's experiment tracker to monitor the experiments.
  • Evaluate and save the best model to Comet's model registry.
  • ☁️ Deployed on AWS SageMaker
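To see why QLoRA makes fine-tuning affordable, recall that it keeps the quantized base weights frozen and trains only small low-rank adapter matrices. The sketch below counts the trainable parameters for one adapted weight matrix. The layer size and rank are illustrative assumptions, not the settings used in the course.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense d_out x d_in weight matrix."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Illustrative values only (assumed, not taken from the course).
d_in = d_out = 4096
rank = 16

full = full_params(d_in, d_out)
lora = lora_trainable_params(d_in, d_out, rank)
print(f"full: {full:,}  adapter: {lora:,}  ratio: {lora / full:.2%}")
```

With these assumed numbers, the adapter holds well under 1% of the parameters of the matrix it adapts, which is what lets the trainable state fit next to a 4-bit base model on a single GPU.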

The inference pipeline

  • Load the fine-tuned LLM from Comet's model registry.
  • Deploy it as a REST API.
  • Enhance the prompts using advanced RAG.
  • Generate content using your LLM twin.
  • Monitor the LLM using Comet's prompt monitoring dashboard.
  • In the bonus series, we refactor the advanced RAG layer to write more optimal queries using Superlinked.
  • ☁️ Deployed on AWS SageMaker
  • Wrap up everything with a Gradio UI (as seen below) where you can start playing around with the LLM Twin.

Gradio UI
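The prompt-enhancement step of the inference pipeline can be sketched as plain top-k retrieval followed by prompt assembly. This is deliberately the basic form of RAG, not the advanced techniques the course covers. The vectors, the prompt template, and the `retrieve` and `build_prompt` helpers are hypothetical, and an in-memory list stands in for the vector DB.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], store: list[dict], k: int = 2) -> list[str]:
    """Return the texts of the k stored items most similar to the query."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item["vector"]), reverse=True)
    return [item["text"] for item in ranked[:k]]


def build_prompt(question: str, context: list[str]) -> str:
    """Prepend the retrieved context to the user's question."""
    joined = "\n".join(f"- {piece}" for piece in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer in the author's style:"


# Toy store with hand-written vectors (assumed data, for illustration only).
store = [
    {"text": "Post about vector DBs", "vector": [1.0, 0.0, 0.0]},
    {"text": "Post about fine-tuning", "vector": [0.0, 1.0, 0.0]},
    {"text": "Post about RAG retrieval", "vector": [0.9, 0.1, 0.0]},
]
context = retrieve([1.0, 0.0, 0.0], store, k=2)
prompt = build_prompt("Write a short post about vector DBs.", context)
print(prompt)
```

The enhanced prompt, rather than the raw question, is what gets sent to the fine-tuned LLM.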

Alongside the 4 microservices, you will learn to integrate 4 serverless tools:

Who is this for?

Audience: MLE, DE, DS, or SWE who want to learn to engineer production-ready LLM systems using LLMOps good principles.

Level: intermediate

Prerequisites: basic knowledge of Python, ML, and the cloud

How will you learn?

The course contains 11 hands-on written lessons and the open-source code you can access on GitHub.

You can read everything and try out the code at your own pace.

Costs?

The articles and code are completely free. They will always remain free.

If you plan to run the code while reading, be aware that we use several cloud tools that might generate additional costs.

Pay as you go

  • AWS offers accessible plans to new joiners.
    • For a new first-time account, you could get up to $300 in free credits, which are valid for 6 months. For more, consult the AWS Offerings page.

Freemium (Free-of-Charge)

Questions and troubleshooting

Please ask us any questions if anything gets confusing while studying the articles or running the code.

You can ask any question by opening an issue in this GitHub repository.

Lessons

→ Quick overview of each lesson of the LLM Twin free course.

Important

To understand the entire code step-by-step, check out our articles ↓

The course is split into 12 lessons. Every Medium article represents an independent lesson.

The lessons are NOT 1:1 with the folder structure!

System design

  1. An End-to-End Framework for Production-Ready LLM Systems by Building Your LLM Twin

Data engineering: Gathering and storing the data for your LLM Twin

  1. Your Content is Gold: I Turned 3 Years of Blog Posts into an LLM Training
  2. I Replaced 1000 Lines of Polling Code with 50 Lines of CDC Magic

Feature pipeline: Feature engineering data for LLM fine-tuning & RAG

  1. SOTA Python Streaming Pipelines for Fine-tuning LLMs and RAG - in Real-Time!
  2. The 4 Advanced RAG Algorithms You Must Know to Implement

Training pipeline: Fine-tuning your LLM Twin

  1. Turning Raw Data Into Fine-Tuning Datasets
  2. 8B Parameters, 1 GPU, No Problems: The Ultimate LLM Fine-tuning Pipeline
  3. The Engineer's Framework for LLM & RAG Evaluation

Inference pipeline: Serving and monitoring your LLM Twin

  1. Beyond Proof of Concept: Building RAG Systems That Scale
  2. Prompt monitoring WIP

Bonus: Refactoring and optimizing the RAG system

  1. Build a scalable RAG ingestion pipeline using 74.3% less code
  2. Build Multi-Index Advanced RAG Apps

Install & Usage

To understand how to install and run the LLM Twin code, go to the INSTALL_AND_USAGE dedicated document.

Note

Even though you can run everything solely using the INSTALL_AND_USAGE dedicated document, we recommend that you read the articles to understand the LLM Twin system and design choices fully.

Bonus Superlinked series

The bonus Superlinked series has an extra dedicated README that you can access under the 6-bonus-superlinked-rag directory.

In that section, we explain how to run it with the improved RAG layer powered by Superlinked.

Meet your teachers!

The course is created under the Decoding ML umbrella by:

Paul Iusztin
Senior AI & LLM Engineer

License

This course is an open-source project released under the MIT license. Thus, as long as you distribute our LICENSE and acknowledge our work, you can safely clone or fork this project and use it as a source of inspiration for whatever you want (e.g., university projects, college degree projects, personal projects, etc.).

Contributors

A big "Thank you 🙏" to all our contributors! This course is possible only because of their efforts.

Sponsors

Also, another big "Thank you 🙏" to all our sponsors who supported our work and made this course possible.

Comet Opik Bytewax Qdrant Superlinked

Next Steps

Our LLM Engineer's Handbook inspired the open-source LLM Twin course.

Consider supporting our work by getting our book to learn a complete framework for building and deploying production LLM & RAG systems, from data to deployment.

Perfect for practitioners who want both theory and hands-on expertise by connecting the dots between DE, research, MLE and MLOps:

Buy the LLM Engineer's Handbook

LLM Engineer's Handbook

Languages

  • Python 86.4%
  • TypeScript 10.3%
  • Makefile 1.9%
  • Other 1.4%