Organizations

@nju-mips

An Open-source RL System from ByteDance Seed and Tsinghua AIR

1,087 46 Updated Apr 10, 2025

An extension of the nanoGPT repository for training small MOE models.

Python 120 15 Updated Mar 9, 2025

My learning notes and code for ML systems.

Python 1,769 108 Updated Apr 12, 2025

procedural reasoning datasets

Python 558 56 Updated Apr 12, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 6,526 693 Updated Apr 13, 2025

Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym

Jupyter Notebook 428 27 Updated Apr 2, 2025

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

4 Updated Dec 20, 2024

This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".

238 12 Updated Apr 11, 2025

Tech Enthusiast Weekly, published every Friday

54,058 3,178 Updated Apr 11, 2025

Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos

Python 1,437 146 Updated Jun 10, 2024

A brief and partial summary of RLHF algorithms.

127 3 Updated Mar 4, 2025

Scaling scaling laws with board games.

Python 48 8 Updated Jul 17, 2023

Recipes to train reward model for RLHF.

Python 1,283 93 Updated Feb 9, 2025

Textbook on reinforcement learning from human feedback

TeX 547 47 Updated Apr 12, 2025

Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"

Python 174 7 Updated Mar 6, 2025

Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.

Python 712 48 Updated Sep 27, 2024

A curated list for awesome discrete diffusion models resources.

294 12 Updated Feb 5, 2025

Official Implementation of DPLM (ICML'24) - Diffusion Language Models Are Versatile Protein Learners

C++ 146 13 Updated Mar 4, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 6,215 611 Updated Apr 13, 2025

A bibliography and survey of the papers surrounding o1

TeX 1,186 50 Updated Nov 16, 2024

Related work and background techniques for OpenAI o1

218 10 Updated Jan 7, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,642 365 Updated Apr 10, 2025

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 2,252 146 Updated Apr 13, 2025

An automatic paper generator

TeX 1,121 258 Updated Jan 9, 2022

The implementation of the ACL 2024 paper "MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization"

Python 42 4 Updated Jun 15, 2024

[TMLR 2024] Efficient Large Language Models: A Survey

1,133 95 Updated Apr 1, 2025

The Open-Source Data Annotation Platform

TypeScript 778 74 Updated Feb 19, 2025

Data annotation toolbox that supports image, audio, and video data.

Python 1,146 116 Updated Apr 10, 2025