yegcjs

30 followers · 37 following

Stars

BytedTsinghua-SIA / DAPO

An Open-source RL System from ByteDance Seed and Tsinghua AIR

1,087 46 Updated Apr 10, 2025

wolfecameron / nanoMoE

Forked from karpathy/nanoGPT

An extension of the nanoGPT repository for training small MOE models.

Python 120 15 Updated Mar 9, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 1,769 108 Updated Apr 12, 2025

open-thought / reasoning-gym

procedural reasoning datasets

Python 558 56 Updated Apr 12, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 6,526 693 Updated Apr 13, 2025

gregorbachmann / Next-Token-Failures

Python 82 8 Updated Mar 12, 2024

SWE-Gym / SWE-Gym

Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym

Jupyter Notebook 428 27 Updated Apr 2, 2025

ZhiYuanZeng / o1-roadmap

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

4 Updated Dec 20, 2024

OS-Agent-Survey / OS-Agent-Survey

This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".

238 12 Updated Apr 11, 2025

ruanyf / weekly

Tech Enthusiast Weekly, published every Friday

54,058 3,178 Updated Apr 11, 2025

openai / Video-Pre-Training

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Python 1,437 146 Updated Jun 10, 2024

yihedeng9 / rlhf-summary-notes

A brief and partial summary of RLHF algorithms.

127 3 Updated Mar 4, 2025

llm-as-a-judge / Awesome-LLM-as-a-judge

307 15 Updated Apr 4, 2025

andyljones / boardlaw

Scaling scaling laws with board games.

Python 48 8 Updated Jul 17, 2023

RLHFlow / RLHF-Reward-Modeling

Recipes to train reward model for RLHF.

Python 1,283 93 Updated Feb 9, 2025

natolambert / rlhf-book

Textbook on reinforcement learning from human feedback

TeX 547 47 Updated Apr 12, 2025

princeton-nlp / ProLong

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 174 7 Updated Mar 6, 2025

jzhang38 / EasyContext

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Python 712 48 Updated Sep 27, 2024

kuleshov-group / awesome-discrete-diffusion-models

A curated list for awesome discrete diffusion models resources.

294 12 Updated Feb 5, 2025

bytedance / dplm

Official Implementation of DPLM (ICML'24) - Diffusion Language Models Are Versatile Protein Learners

C++ 146 13 Updated Mar 4, 2025

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 6,215 611 Updated Apr 13, 2025

srush / awesome-o1

A bibliography and survey of the papers surrounding o1

TeX 1,186 50 Updated Nov 16, 2024

wjn1996 / Awesome-LLM-Reasoning-Openai-o1-Survey

Related work and background techniques for OpenAI o1

218 10 Updated Jan 7, 2025

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,642 365 Updated Apr 10, 2025

fla-org / flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 2,252 146 Updated Apr 13, 2025

strib / scigen

An automatic paper generator

TeX 1,121 258 Updated Jan 9, 2022

NJUNLP / MAPO

Implementation of the ACL 2024 paper "MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization"

Python 42 4 Updated Jun 15, 2024

AIoT-MLSys-Lab / Efficient-LLMs-Survey

[TMLR 2024] Efficient Large Language Models: A Survey

1,133 95 Updated Apr 1, 2025

opendatalab / LabelLLM

The Open-Source Data Annotation Platform

TypeScript 778 74 Updated Feb 19, 2025

opendatalab / labelU

A data annotation toolbox that supports image, audio, and video data.

Python 1,146 116 Updated Apr 10, 2025
