Stars
Real-time face swap and one-click video deepfake with only a single image
Instant voice cloning by MIT and MyShell.
Pretrain, finetune and deploy AI models on multiple GPUs or TPUs with zero code changes.
A high-throughput and memory-efficient inference and serving engine for LLMs
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
JARVIS, a system to connect LLMs with the ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
State-of-the-art 2D and 3D Face Analysis Project
🦔 PostHog provides open-source product analytics, session recording, feature flagging and A/B testing that you can self-host.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
A modular graph-based Retrieval-Augmented Generation (RAG) system
Faker is a Python package that generates fake data for you.
DSPy: The framework for programming—not prompting—foundation models
State-of-the-Art Text Embeddings
An open-source RAG-based tool for chatting with your documents.
Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable,…
Run any open-source LLM, such as Llama 3.1 or Gemma, as an OpenAI-compatible API endpoint in the cloud.
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
Community-maintained fork of pdfminer - we fathom PDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Automatically visualize your pandas dataframe via a single print! 📊 💡
🔍 An LLM-based multi-agent framework for a web search engine (like Perplexity.ai Pro and SearchGPT)
The RedPajama-Data repository contains code for preparing large datasets for training large language models.