Stars
Official repo for "GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation"
Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
A model fine-tuned from Qwen2.5-1.5B-Instruct, capable of handling sensitive topics (primarily adult content).
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Distributed Triton for Parallel Systems
Qwen2.5-Omni is an end-to-end multimodal model by Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, video, and performing real-time speech generation.
Agentless🐱: an agentless approach to automatically solve software development problems
✨ First Open-Source R1-like Video-LLM [2025/02/18]
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
TokenBridge: Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation. https://yuqingwang1029.github.io/TokenBridge
Train your AI self, amplify you, bridge the world
NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
Latest Advances on System-2 Reasoning
TxAgent: An AI Agent for Therapeutic Reasoning Across a Universe of Tools
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
No fortress, purely open ground. OpenManus is Coming.
The first Large Audio Language Model that enables native in-depth thinking, which is trained on large-scale audio Chain-of-Thought data.
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
Ola: Pushing the Frontiers of Omni-Modal Language Model