Stars
My learning notes and code for ML systems.
veRL: Volcano Engine Reinforcement Learning for LLMs
Open Thoughts: Fully Open Data Curation for Thinking Models
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Fine-tune Llama 3.3, DeepSeek-R1, and reasoning LLMs 2x faster with 70% less memory
Lightweight tool to identify data contamination in LLM evaluation
Training and Benchmarking LLMs for Code Preference.
PyTorch implementation of Tree Preference Optimization (TPO)
[SafeGenAi @ NeurIPS 2024] Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates
Implementation of the MATRIX framework (ICML 2024)
Official code for the paper: WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents
Arena-Hard-Auto: An automatic LLM benchmark.
Command-line YAML, XML, TOML processor - jq wrapper for YAML/XML/TOML documents
Secrets of RLHF in Large Language Models Part I: PPO
PaL: Program-Aided Language Models (ICML 2023)
SGLang is a fast serving framework for large language models and vision language models.
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
Scrape from Twitter using Nitter instances
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
It is said that Ilya Sutskever gave John Carmack this reading list of ~30 research papers on deep learning.
Together Mixture-Of-Agents (MoA) – 65.1% on AlpacaEval with OSS models