Lists (19)
Stars
Witness the aha moment of VLM with less than $3.
Fully open reproduction of DeepSeek-R1
A curated list of papers for generalist agents
Tracking papers, datasets, and models on "large language model (LLM) for time series"
Official repository for our work on micro-budget training of large-scale diffusion models.
[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Minimalistic 4D-parallelism distributed training framework for educational purposes
Official PyTorch Implementation of Learning Affordance Grounding from Exocentric Images, CVPR 2022
[ICCV 2023] Understanding 3D Object Interaction from a Single Image
Code for "AffordanceLLM: Grounding Affordance from Vision Language Models"
MOKA: Open-World Robotic Manipulation through Mark-based Visual Prompting (RSS 2024)
A suite of image and video neural tokenizers
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]
An Invitation to 3D Vision: A Tutorial for Everyone
Awesome-LLM-3D: a curated list of resources on multimodal large language models in the 3D world
Soft copies of engineering books. If you want a book(s) to be taken down, please contact me.
Here we will track the latest AI Multimodal Models, including Multimodal Foundation Models, LLM, Agent, Audio, Image, Video, Music and 3D content. 🔥
Clean PyTorch implementations of imitation and reward learning algorithms
A curated list of resources on diffusion models in RL (continually updated)
Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning
Recipes to train reward models for RLHF.