Code for the Medium article "How to Evaluate LLM Summarization"
A lightweight, powerful framework for multi-agent workflows
A curated list of papers and resources on recommender systems built on large language models (LLMs).
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen2.5, Llama4, InternLM3, GLM4, Mistral, Yi1.5, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3…)
The communications platform that puts data protection first.
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Fully open reproduction of DeepSeek-R1
🔥 LeetCode solutions in any programming language | Solutions to LeetCode, 《剑指 Offer》 (2nd Edition), and 《程序员面试金典》 (6th Edition) in multiple programming languages
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
This collection aims to present the ‘cherry on the cake’ of recent AI advancements in the realm of LLMs and RL.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or rejection sampling fine-tuning.
Use LLMs to surface what you care about each day from massive amounts of information across a variety of sources.
Robust recipes to align language models with human and AI preferences
Train transformer language models with reinforcement learning.