Starred repositories
This is the official release for the paper "EFM3D: A Benchmark for Measuring Progress Towards 3D Egocentric Foundation Models" (https://arxiv.org/abs/2406.10224).
[3DV2025] SPAFormer: Sequential 3D Part Assembly with Transformers
Official implementation of "SUGAR: Pre-training 3D Visual Representations for Robotics" (CVPR'24).
Official Code for "SMPLer-X: Scaling Up Expressive Human Pose and Shape Estimation"
Bimanual Dexterous Teleoperation with Real-Time Retargeting using VisionPro
Paper repo for publication: "Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds".
Mobile manipulation research tools for roboticists
Repository to identify Lego bricks automatically only using images
Vector (and Scalar) Quantization, in Pytorch
Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web"
Official JAX implementation of MAGVIT: Masked Generative Video Transformer
A high-throughput and memory-efficient inference and serving engine for LLMs
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
3D LEGO models and mosaics from images using R and #tidyverse
A modular RL library to fine-tune language models to human preferences
Making large AI models cheaper, faster and more accessible
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.