Stars
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
A Vision-Language Model for Spatial Affordance Prediction in Robotics
PyTorch implementation of MAE https://arxiv.org/abs/2111.06377
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
Align Anything: Training All-modality Model with Feedback
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance
A generative world for general-purpose robotics & embodied AI learning.
Mastering Diverse Domains through World Models
🚀LLaMA-MoE v2: Exploring Sparsity of LLaMA from Perspective of Mixture-of-Experts with Post-Training
Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI
Implementation of rectified flow and some of its followup research / improvements in Pytorch
Let your Claude be able to think
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
Qwen2.5-Coder is the code version of Qwen2.5, the large language model series developed by Qwen team, Alibaba Cloud.
The Open Cookbook for Top-Tier Code Large Language Model
A high-throughput and memory-efficient inference and serving engine for LLMs
Elucidating The Design Space of Classifier-Guided Diffusion Generation
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
A community-maintained Python framework for creating mathematical animations.
A curated list of awesome works related to high-dimensional structure/vector search & databases
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solution team.
Robot Utility Models are trained on a diverse set of environments and objects, and then can be deployed in novel environments with novel objects without any further data or training.
The AI-native open-source embedding database
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 2, and other large language models.