Jackory

Yuhua Jiang Jackory

Nurture talents in obscurity.

35 followers · 42 following

Tsinghua University
Beijing
19:17 (UTC +08:00)
https://jackory.github.io/

Lists (6)

Classic

3 repositories

🔮 Future ideas

Stars

NJUNLP / R-PRM

Python 22 Updated Apr 1, 2025

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

Python 863 40 Updated Apr 15, 2025

CMU-AIRe / MRT

Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".

92 3 Updated Mar 12, 2025

yeongpin / cursor-free-vip

[Support 0.48.x] (Reset Cursor AI MachineID & Bypass Higher Token Limit) Cursor AI: automatically reset the machine ID and upgrade to Pro features for free: You've reached your trial request limit. / Too many free trial accounts used on this machi…

Python 19,263 2,354 Updated Apr 17, 2025

openai / openai-agents-python

A lightweight, powerful framework for multi-agent workflows

Python 8,974 1,137 Updated Apr 18, 2025

RLHFlow / Self-rewarding-reasoning-LLM

Recipes to train the self-rewarding reasoning LLMs.

Python 212 9 Updated Mar 2, 2025

agentica-project / rllm

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,072 278 Updated Apr 11, 2025

NineAbyss / S2R

This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"

Python 59 2 Updated Mar 13, 2025

MoonshotAI / Moonlight

Muon is Scalable for LLM Training

1,022 42 Updated Mar 28, 2025

deepseek-ai / open-infra-index

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,680 269 Updated Apr 14, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 1,872 96 Updated Apr 8, 2025

daje0601 / Google_SCoRe

Paper Reproduction: Google SCoRE (Training Language Models to Self-Correct via Reinforcement Learning)

Jupyter Notebook 139 23 Updated Sep 21, 2024

huggingface / Math-Verify

Python 630 25 Updated Mar 31, 2025

facebookresearch / MRQ

MR.Q is a general-purpose model-free reinforcement learning algorithm.

Python 88 4 Updated Apr 9, 2025

InternLM / OREAL

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Python 171 6 Updated Mar 20, 2025

Div99 / XQL

Extreme Q-Learning: Max Entropy RL without Entropy

Python 86 10 Updated Feb 14, 2023

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 6,839 740 Updated Apr 19, 2025

LongHZ140516 / awesome-framework-gallery

Awesome lists about framework figures in papers

744 17 Updated Mar 26, 2025

Jiayi-Pan / TinyZero

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,595 1,468 Updated Apr 2, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 1,850 116 Updated Apr 18, 2025

hkust-nlp / simpleRL-reason

Simple RL training for reasoning

Python 3,472 260 Updated Apr 10, 2025

huggingface / open-r1

Fully open reproduction of DeepSeek-R1

Python 24,020 2,196 Updated Apr 18, 2025

cmu-l3 / neurips2024-inference-tutorial-code

NeurIPS 2024 tutorial on LLM Inference

Jupyter Notebook 41 3 Updated Dec 10, 2024

browser-use / browser-use

Make websites accessible for AI agents

Python 56,749 6,086 Updated Apr 18, 2025

kanishkg / stream-of-search

Repository for the paper Stream of Search: Learning to Search in Language

Python 144 22 Updated Feb 3, 2025

FellouAI / eko

Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai

TypeScript 2,965 203 Updated Apr 19, 2025

SakanaAI / asal

Automating the Search for Artificial Life with Foundation Models!

Jupyter Notebook 406 45 Updated Jan 12, 2025

nissymori / JAX-CORL

Clean single-file implementation of offline RL algorithms in JAX

Python 141 2 Updated Dec 24, 2024

PRIME-RL / PRIME

Scalable RL solution for advanced reasoning of language models

Python 1,488 91 Updated Mar 18, 2025

huggingface / search-and-learn

Recipes to scale inference-time compute of open models

Python 1,055 110 Updated Feb 25, 2025

Lists (6)

Classic

🔮 Future ideas

Interesting

Papers

RepowithPaper

RL repo
