Stars
An Open-source RL System from ByteDance Seed and Tsinghua AIR
wolfecameron / nanoMoE
Forked from karpathy/nanoGPT
An extension of the nanoGPT repository for training small MoE models.
My learning notes and code for ML systems.
verl: Volcano Engine Reinforcement Learning for LLMs
Code for Paper: Training Software Engineering Agents and Verifiers with SWE-Gym
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".
Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
A brief and partial summary of RLHF algorithms.
Recipes for training reward models for RLHF.
Textbook on reinforcement learning from human feedback
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
A curated list for awesome discrete diffusion models resources.
Official Implementation of DPLM (ICML'24) - Diffusion Language Models Are Versatile Protein Learners
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
A bibliography and survey of the papers surrounding o1
Related work and background techniques for OpenAI o1
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
Implementation of the ACL 2024 paper "MAPO: Advancing Multilingual Reasoning through Multilingual Alignment-as-Preference Optimization"
[TMLR 2024] Efficient Large Language Models: A Survey
The Open-Source Data Annotation Platform
A data annotation toolbox that supports image, audio, and video data.