Stars
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
A journey to a real multimodal R1! We are running a large-scale experiment.
Witness the aha moment of VLM with less than $3.
Frontier Multimodal Foundation Models for Image and Video Understanding
Janus-Series: Unified Multimodal Understanding and Generation Models
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
Token-level visualization tools for large language models
veRL: Volcano Engine Reinforcement Learning for LLM
UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including CUDA, x86 and ARMv9.
Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment
Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
Code for ChatRex: Taming Multimodal LLM for Joint Perception and Understanding
The implementation of the paper 'Advancing Fine-Grained Visual Understanding with Multi-Granularity Alignment in Multi-Modal Models'
[CVPR 2024] Official PyTorch Code for "PromptKD: Unsupervised Prompt Distillation for Vision-Language Models"
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
Reading notes about Multimodal Large Language Models, Large Language Models, and Diffusion Models
Accelerating the development of large multimodal models (LMMs) with the one-click evaluation module lmms-eval.
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities (ICML 2024)
Reasoning in LLMs: Papers and Resources, including Chain-of-Thought, OpenAI o1, and DeepSeek-R1 🍓