  • Zhejiang University
  • Hangzhou, China


✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Python 2,241 168 Updated Mar 28, 2025

Model Context Protocol Servers

JavaScript 40,474 4,407 Updated Apr 24, 2025

A course on aligning smol models.

Jupyter Notebook 5,761 2,022 Updated Jan 24, 2025

Collects and organizes open-source models, datasets, and evaluation benchmarks for vertical domains.

2,486 200 Updated Dec 26, 2023

Solve Visual Understanding with Reinforced VLMs

Python 4,752 299 Updated Apr 21, 2025

An open-source solution for full parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as some practical experiences and conclusions.…

Python 660 81 Updated Mar 13, 2025

Awesome Reasoning LLM Tutorial/Survey/Guide

Python 1,438 99 Updated Apr 11, 2025

Official repository of ’Visual-RFT: Visual Reinforcement Fine-Tuning’

Python 1,606 74 Updated Apr 18, 2025

[KDD 2024] Team up GBDTs and DNNs: Advancing Efficient and Effective Tabular Prediction with Tree-hybrid MLPs

Python 10 Updated Mar 3, 2025

s1: Simple test-time scaling

Python 6,300 741 Updated Apr 4, 2025

Scalable RL solution for advanced reasoning of language models

Python 1,505 91 Updated Mar 18, 2025

A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

Python 896 58 Updated Mar 25, 2025

Witness the aha moment of VLM with less than $3.

Python 3,580 280 Updated Mar 1, 2025

Fully open reproduction of DeepSeek-R1

Python 24,118 2,214 Updated Apr 23, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,644 1,472 Updated Apr 2, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,698 369 Updated Apr 23, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,298 1,395 Updated Mar 3, 2025

An open source code repository of driving world models, with training, inferencing, evaluation tools, and pretrained checkpoints.

Python 228 35 Updated Apr 24, 2025

The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention

Python 2,544 191 Updated Apr 10, 2025

A self-learning tutorial for CUDA High Performance Programming.

JavaScript 601 65 Updated Apr 12, 2025

[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

421 9 Updated Jan 17, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 6,207 536 Updated Apr 24, 2025

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 4,747 1,722 Updated Feb 26, 2025

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

2,204 100 Updated Jan 26, 2025

📖 This is a repository for organizing papers, codes and other resources related to unified multimodal models.

517 25 Updated Apr 9, 2025

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,389 261 Updated Apr 23, 2025
Shell 85 8 Updated Feb 20, 2025