Python 22 Updated Apr 1, 2025

Understanding R1-Zero-Like Training: A Critical Perspective

Python 863 40 Updated Apr 15, 2025

Research Code for preprint "Optimizing Test-Time Compute via Meta Reinforcement Finetuning".

92 3 Updated Mar 12, 2025

[Support 0.48.x](Reset Cursor AI MachineID & Bypass Higher Token Limit) Cursor AI, automatically reset the machine ID, free upgrade to use Pro features: You've reached your trial request limit. / Too many free trial accounts used on this machi…

Python 19,263 2,354 Updated Apr 17, 2025

A lightweight, powerful framework for multi-agent workflows

Python 8,974 1,137 Updated Apr 18, 2025

Recipes to train the self-rewarding reasoning LLMs.

Python 212 9 Updated Mar 2, 2025

Democratizing Reinforcement Learning for LLMs

Jupyter Notebook 3,072 278 Updated Apr 11, 2025

This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"

Python 59 2 Updated Mar 13, 2025

Muon is Scalable for LLM Training

1,022 42 Updated Mar 28, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,680 269 Updated Apr 14, 2025

Official Repo for Open-Reasoner-Zero

Python 1,872 96 Updated Apr 8, 2025

Paper reproduction of Google SCoRE (Training Language Models to Self-Correct via Reinforcement Learning)

Jupyter Notebook 139 23 Updated Sep 21, 2024
Python 630 25 Updated Mar 31, 2025

MR.Q is a general-purpose model-free reinforcement learning algorithm.

Python 88 4 Updated Apr 9, 2025

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Python 171 6 Updated Mar 20, 2025

Extreme Q-Learning: Max Entropy RL without Entropy

Python 86 10 Updated Feb 14, 2023

verl: Volcano Engine Reinforcement Learning for LLMs

Python 6,839 740 Updated Apr 19, 2025

Awesome lists about framework figures in papers

744 17 Updated Mar 26, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,595 1,468 Updated Apr 2, 2025

My learning notes/codes for ML SYS.

Python 1,850 116 Updated Apr 18, 2025

Simple RL training for reasoning

Python 3,472 260 Updated Apr 10, 2025

Fully open reproduction of DeepSeek-R1

Python 24,020 2,196 Updated Apr 18, 2025

NeurIPS 2024 tutorial on LLM Inference

Jupyter Notebook 41 3 Updated Dec 10, 2024

Make websites accessible for AI agents

Python 56,749 6,086 Updated Apr 18, 2025

Repository for the paper Stream of Search: Learning to Search in Language

Python 144 22 Updated Feb 3, 2025

Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai

TypeScript 2,965 203 Updated Apr 19, 2025

Automating the Search for Artificial Life with Foundation Models!

Jupyter Notebook 406 45 Updated Jan 12, 2025

Clean single-file implementation of offline RL algorithms in JAX

Python 141 2 Updated Dec 24, 2024

Scalable RL solution for advanced reasoning of language models

Python 1,488 91 Updated Mar 18, 2025

Recipes to scale inference-time compute of open models

Python 1,055 110 Updated Feb 25, 2025