
Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 240 7 Updated Feb 19, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 957 44 Updated Feb 19, 2025

Multilingual large voice generation model with full-stack support for inference, training, and deployment.

Python 10,841 1,060 Updated Feb 16, 2025

A journey to a real multimodal R1! We are running large-scale experiments.

Python 197 1 Updated Feb 12, 2025

A generative speech model for daily dialogue.

Python 34,541 3,726 Updated Feb 18, 2025

RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and DeepSeek R1-Zero reproduction.

Python 38 6 Updated Feb 19, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 4,861 481 Updated Feb 20, 2025

A fork to add multimodal model training to open-r1

Python 779 43 Updated Feb 8, 2025

Fully open reproduction of DeepSeek-R1

Python 20,841 1,819 Updated Feb 20, 2025
Python 2,183 151 Updated Feb 20, 2025

An Approach to Enhancing the Efficacy of Post-Training Using Synthetic Data by Iterative Data Selection

Python 5 Updated Dec 24, 2024

A series of technical reports on slow thinking with LLMs

Python 410 21 Updated Feb 12, 2025
1 Updated Feb 20, 2025
Python 1,332 49 Updated Nov 21, 2024

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 886 115 Updated Jan 4, 2025

LLM KV cache compression made easy

Python 397 26 Updated Feb 18, 2025

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,077 72 Updated Jan 23, 2025

Optimizing inference proxy for LLMs

Python 2,045 159 Updated Feb 16, 2025

OS-ATLAS: A Foundation Action Model For Generalist GUI Agents

Python 281 13 Updated Feb 20, 2025

Efficient Triton Kernels for LLM Training

Python 4,449 270 Updated Feb 20, 2025
2 Updated Oct 25, 2024

Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen

373 22 Updated Dec 18, 2024

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 6,890 586 Updated Feb 29, 2024

📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024

Jupyter Notebook 37 4 Updated Oct 15, 2024

Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 5,678 486 Updated Feb 20, 2025

Extensible, parallel implementations of t-SNE

Python 1,496 168 Updated Oct 24, 2024

A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems

243 13 Updated Jan 25, 2025