Stars
CAD-Recode: Reverse Engineering CAD Code from Point Clouds
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
NVIDIA Isaac GR00T N1 is the world's first open foundation model for generalized humanoid robot reasoning and skills.
[CVPR 2025] Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
Generate a video script, voice and a talking face completely with AI
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…
CUDA accelerated rasterization of gaussian splatting
Qwen2.5 is the large language model series developed by the Qwen team, Alibaba Cloud.
An open-source library for GPU-accelerated robot learning and sim-to-real transfer.
[CVPR 2025] The First Investigation of CoT Reasoning in Image Generation
Sky-T1: Train your own O1 preview model within $450
The official repo of MiniMax-Text-01 and MiniMax-VL-01, a large language model and a vision-language model based on Linear Attention
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
A programming framework for agentic AI 🤖 (PyPI: autogen-agentchat; Discord: https://aka.ms/autogen-discord; Office Hour: https://aka.ms/autogen-officehour)
Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective…
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).
Finetune Llama 4, DeepSeek-R1, Gemma 3 & Reasoning LLMs 2x faster with 70% less memory! 🦥
HunyuanVideo: A Systematic Framework For Large Video Generation Model
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
[ICLR'25] SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Letta (formerly MemGPT) is the stateful agents framework with memory, reasoning, and context management.
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer