Stars
✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows
This is InfiniRetri, a tool that enhances the ability of Transformer-based LLMs (Large Language Models) to handle long contexts.
Prepare for DeepSeek R1 inference: Benchmark CPU, DRAM, SSD, iGPU, GPU, ... with efficient code.
SGLang is a fast serving framework for large language models and vision language models.
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
Fully open reproduction of DeepSeek-R1
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
Qwen2.5 is the large language model series developed by the Qwen team, Alibaba Cloud.
Visualizer for neural network, deep learning and machine learning models
🔒 Enterprise-grade API gateway that helps you monitor and impose cost or rate limits per API key. Get fine-grained access control and monitoring per user, application, or environment. Supports Open…
🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time, high-quality TTS.
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
Archive of historical WeChat versions
Continuous batching and parallel acceleration for RWKV6
Rudimentary support for using multiple GPUs in a ComfyUI workflow
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
[ICLR 2025] TidalDecode: A Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API
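Streaming ASR and TTS based on FastAPI + sherpa-onnx
ChatGPT web application with support for multiple conversations, a large prompt library, PWA, ASR, and TTS
Real-time voice interactive digital human, supporting end-to-end voice solutions (GLM-4-Voice - THG) and cascaded solutions (ASR-LLM-TTS-THG). Appearance and voice are customizable with no training required, voice cloning is supported, and first-packet latency is as low as 3s.
百聆 (Bailing) is a GPT-4o-like voice conversation bot built from ASR + LLM + TTS. It integrates strong large models such as DeepSeek R1, has latency as low as 800 ms, runs on low-spec machines such as Macs, and supports interruption.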