Stars
People choose popular projects, often not because they apply to their problems
A course on aligning smol models.
Build datasets using natural language
A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications.
Curated list of datasets and tools for post-training.
The easiest way to use Agentic RAG in any enterprise
Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
DSPy: The framework for programming—not prompting—language models
🐢 Open-Source Evaluation & Testing for AI & LLM systems
An AI-powered Personally Identifiable Information (PII) scanner.
Supercharge Your LLM Application Evaluations 🚀
📚 Parameterize, execute, and analyze notebooks
Rich is a Python library for rich text and beautiful formatting in the terminal.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
[IJCAI 2024] Generate different roles for GPTs to form a collaborative entity for complex tasks.
FastFit ⚡ When LLMs are Unfit, Use FastFit ⚡ Fast and Effective Text Classification with Many Classes
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets