Stars
🌳 Web Development in Pure Python with Type-Guided Components.
Langflow is a powerful tool for building and deploying AI-powered agents and workflows.
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
The official home of the Presto distributed SQL query engine for big data
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
An open-source runtime for composable workflows. Great for AI agents and CI/CD.
Robyn is a Super Fast Async Python Web Framework with a Rust runtime.
Asynchronous HTTP client/server framework for asyncio and Python
"MiniRAG: Making RAG Simpler with Small and Free Language Models"
PyScript is an open source platform for Python in the browser. Try PyScript: https://pyscript.com Examples: https://tinyurl.com/pyscript-examples Community: https://discord.gg/HxvBtukrg2
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
⚙️🦀 Build portable, modular & lightweight Fullstack Agents
Spacedrive is an open source cross-platform file explorer, powered by a virtual distributed filesystem written in Rust.
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deploym…
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A high-throughput and memory-efficient inference and serving engine for LLMs
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
Opinionated RAG for integrating GenAI in your apps 🧠 Focus on your product rather than the RAG. Easy integration in existing products with customisation! Any LLM: GPT4, Groq, Llama. Any Vectorstore: …
Rust / Wasm framework for creating reliable and efficient web applications
Xray, Penetrates Everything. Also the best v2ray-core. Where the magic happens.
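A curated collection of open-source Chinese large language models, focusing on smaller models that can be deployed privately and are inexpensive to train, including base models, domain-specific fine-tuned models and applications, datasets, and tutorials.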
Hackable and optimized Transformers building blocks, supporting a composable construction.
Finetune Llama 3.3, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Fast and memory-efficient exact attention