Code for the Medium article "How to Evaluate LLM Summarization"
A lightweight, powerful framework for multi-agent workflows
A curated list of papers and resources on recommender systems built on large language models (LLMs).
Use PEFT or full-parameter training to run CPT/SFT/DPO/GRPO on 500+ LLMs (Qwen2.5, Llama4, InternLM3, GLM4, Mistral, Yi1.5, DeepSeek-R1, ...) and 200+ MLLMs (Qwen2.5-VL, Qwen2.5-Omni, Qwen2-Audio, Ovis2, InternVL3…)
The communications platform that puts data protection first.
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Fully open reproduction of DeepSeek-R1
🔥 LeetCode solutions in any programming language | Solutions to LeetCode, 《剑指 Offer》 (2nd Edition), and 《程序员面试金典》 (6th Edition) in multiple programming languages
🚀🤖 Crawl4AI: Open-source LLM Friendly Web Crawler & Scraper. Don't be shy, join here: https://discord.gg/jP8KfhDhyN
This collection aims to present the ‘cherry on the cake’ of recent AI advancements in the realm of LLMs and RL.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
Sharing both practical insights and theoretical knowledge about LLM evaluation that we gathered while managing the Open LLM Leaderboard and designing lighteval!
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
[NeurIPS 2023 Spotlight] LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios (awesome MCTS)
This is an official implementation of the Reward rAnked Fine-Tuning Algorithm (RAFT), also known as iterative best-of-n fine-tuning or rejection sampling fine-tuning.
Use LLMs to surface what you care about each day from massive amounts of information across a variety of sources.
Robust recipes to align language models with human and AI preferences
Train transformer language models with reinforcement learning.