DRSY

🎯

Focusing

任思宇 DRSY

Efficient Methods for NLP/LLMs

61 followers · 39 following

Shanghai Jiao Tong University
Shanghai, China
https://drsy.github.io/

Lists (3)

🔮 Future ideas

✨ Inspiration

🚀 My stack

Stars

Ola-Omni / Ola

Ola: Pushing the Frontiers of Omni-Modal Language Model

Python 240 7 Updated Feb 19, 2025

MoonshotAI / MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 957 44 Updated Feb 19, 2025

FunAudioLLM / CosyVoice

Multilingual large voice generation model, providing full-stack inference, training, and deployment capabilities.

Python 10,841 1,060 Updated Feb 16, 2025

FanqingM / R1-Multimodal-Journey

A journey to a real multimodal R1! We are running large-scale experiments.

Python 197 1 Updated Feb 12, 2025

2noise / ChatTTS

A generative speech model for daily dialogue.

Python 34,541 3,726 Updated Feb 18, 2025

jackfsuia / nanoRLHF

RLHF experiments on a single A100 40G GPU. Supports PPO, GRPO, REINFORCE, RAFT, RLOO, ReMax, and reproduction of DeepSeek R1-Zero.

Python 38 6 Updated Feb 19, 2025

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 4,861 481 Updated Feb 20, 2025

EvolvingLMMs-Lab / open-r1-multimodal

A fork to add multimodal model training to open-r1

Python 779 43 Updated Feb 8, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 20,841 1,819 Updated Feb 20, 2025

MoonshotAI / Kimi-k1.5

3,003 177 Updated Feb 2, 2025

deepseek-ai / DeepSeek-R1

78,875 10,200 Updated Feb 18, 2025

MiniMax-AI / MiniMax-01

Python 2,183 151 Updated Feb 20, 2025

deepseek-ai / DeepSeek-V3

Python 86,725 13,968 Updated Feb 18, 2025

JiaQiSJTU / IterIT

An Approach to Enhancing the Efficacy of Post-Training Using Synthetic Data by Iterative Data Selection

Python 5 Updated Dec 24, 2024

RUCAIBox / Slow_Thinking_with_LLMs

A series of technical reports on Slow Thinking with LLMs

Python 410 21 Updated Feb 12, 2025

xiujiesong / ISA

1 Updated Feb 20, 2025

Open-Source-O1 / Open-O1

Python 1,332 49 Updated Nov 21, 2024

Zefan-Cai / KVCache-Factory

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 886 115 Updated Jan 4, 2025

NVIDIA / kvpress

LLM KV cache compression made easy

Python 397 26 Updated Feb 18, 2025

DAMO-NLP-SG / VideoLLaMA2

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Python 1,077 72 Updated Jan 23, 2025

codelion / optillm

Optimizing inference proxy for LLMs

Python 2,045 159 Updated Feb 16, 2025

OS-Copilot / OS-Atlas

OS-ATLAS: A Foundation Action Model For Generalist GUI Agents

Python 281 13 Updated Feb 20, 2025

linkedin / Liger-Kernel

Efficient Triton Kernels for LLM Training

Python 4,449 270 Updated Feb 20, 2025

DRSY / AbstractSoul

2 Updated Oct 25, 2024

facebookresearch / MovieGenBench

Movie Gen Bench - two media generation evaluation benchmarks released with Meta Movie Gen

373 22 Updated Dec 18, 2024

geekyutao / Inpaint-Anything

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 6,890 586 Updated Feb 29, 2024

davendw49 / llm_training_full_stack

📖 Full Stack Practice of the Large Language Model Training @ RLChina 2024

Jupyter Notebook 37 4 Updated Oct 15, 2024

modelscope / ms-swift

Use PEFT or Full-parameter to finetune 450+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek-R1, ...) and 150+ MLLMs (Qwen2.5-VL, Qwen2-Audio, Llama3.2-Vision, Llava, I…

Python 5,678 486 Updated Feb 20, 2025

pavlin-policar / openTSNE

Extensible, parallel implementations of t-SNE

Python 1,496 168 Updated Oct 24, 2024

ziqihuangg / Awesome-Evaluation-of-Visual-Generation

A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems

243 13 Updated Jan 25, 2025