Stars
This is the official implementation of our paper "QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension"
Official Repo for Open-Reasoner-Zero
Solve Visual Understanding with Reinforced VLMs
✨✨Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy
The Next Step Forward in Multimodal LLM Alignment
Train your GRPO with zero dataset and low resources, 8bit/4bit/lora/qlora supported, multi-gpu supported ...
This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Finetune Llama 3.3, DeepSeek-R1 & Reasoning LLMs 2x faster with 70% less memory! 🦥
A fork to add multimodal model training to open-r1
A journey to a real multimodal R1! We are running large-scale experiments
Repo for paper "T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs"
MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models
Enable your Claude to think
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation
An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)
32 projects in the framework of Deep Reinforcement Learning algorithms: Q-learning, DQN, PPO, DDPG, TD3, SAC, A2C and others. Each project is provided with a detailed training log.
Paper collections of multi-modal LLM for Math/STEM/Code.
An Open Large Reasoning Model for Real-World Solutions
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
The official implementation of VITA, VITA15 and LongVITA.