ZhaoyangLiu-Leo
  • Alibaba
  • Shanghai

Organizations

@GAN-Challenger

Starred repositories

The official Python SDK for Model Context Protocol servers and clients

Python 3,458 323 Updated Mar 12, 2025

A simple screen parsing tool towards pure vision based GUI agent

Jupyter Notebook 20,151 1,637 Updated Mar 13, 2025

R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Python 270 18 Updated Mar 10, 2025

Official Repo for Open-Reasoner-Zero

Python 1,585 74 Updated Mar 5, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 4,667 446 Updated Mar 13, 2025

Super-Efficient RLHF Training of LLMs with Parameter Reallocation

Python 236 14 Updated Jan 13, 2025

Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling

89 Updated Jan 23, 2025

Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel AI SDK! Search with models like Grok 2.0.

TypeScript 7,269 851 Updated Mar 8, 2025

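PDF version of the widely circulated "Deepseek: From Beginner to Mastery" guide, from the Metaverse Culture Lab, New Media Research Center, School of Journalism and Communication, Tsinghua University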

348 134 Updated Feb 14, 2025

A curated list of reinforcement learning with human feedback resources (continually updated)

3,795 232 Updated Feb 19, 2025

Aligning Large Language Models with Human: A Survey

724 32 Updated Sep 11, 2023

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)

Python 2,474 123 Updated Mar 13, 2024

LLM training in simple, raw C/CUDA

Cuda 26,005 2,982 Updated Oct 2, 2024

AllenAI's post-training codebase

Python 2,789 358 Updated Mar 12, 2025

Keeps searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)

TypeScript 3,392 314 Updated Mar 13, 2025

This is a replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 3,141 234 Updated Feb 19, 2025

Fully open reproduction of DeepSeek-R1

Python 22,693 2,038 Updated Mar 13, 2025

Generative Judge for Evaluating Alignment

Python 230 15 Updated Jan 18, 2024

RewardBench: the first evaluation tool for reward models.

Python 521 61 Updated Feb 27, 2025

The open-source visual AI programming environment and TypeScript library

TypeScript 3,581 296 Updated Mar 10, 2025
Python 2,331 165 Updated Mar 6, 2025
Python 308 16 Updated Jun 24, 2024

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,140 1,416 Updated Mar 10, 2025

pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tidb.ai

TypeScript 2,440 137 Updated Mar 12, 2025

Automatic prompt optimization framework for multi-step agent tasks.

PDDL 28 2 Updated Nov 12, 2024

KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…

Python 5,949 398 Updated Mar 13, 2025

g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains

Python 4,195 377 Updated Jan 27, 2025

Search-o1: Agentic Search-Enhanced Large Reasoning Models

Python 702 77 Updated Mar 4, 2025