Alibaba
- Shanghai
Starred repositories
The official Python SDK for Model Context Protocol servers and clients
A simple screen-parsing tool towards a pure-vision-based GUI agent
R1-searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Official Repo for Open-Reasoner-Zero
verl: Volcano Engine Reinforcement Learning for LLMs
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling
Scira (Formerly MiniPerplx) is a minimalistic AI-powered search engine that helps you find information on the internet. Powered by Vercel AI SDK! Search with models like Grok 2.0.
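PDF version of the widely circulated "Deepseek: From Beginner to Master" guide, from the Metaverse Culture Lab, New Media Research Center, School of Journalism and Communication, Tsinghua University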
A curated list of reinforcement learning with human feedback resources (continually updated)
Aligning Large Language Models with Human: A Survey
Large-scale, Informative, and Diverse Multi-round Chat Data (and Models)
Keeps searching, reading webpages, and reasoning until it finds the answer (or exceeds the token budget)
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Fully open reproduction of DeepSeek-R1
RewardBench: the first evaluation tool for reward models.
The open-source visual AI programming environment and TypeScript library
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
pingcap/autoflow is a Graph RAG based and conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tidb.ai
Automatic prompt optimization framework for multi-step agent tasks.
KAG is a logical form-guided reasoning and retrieval framework based on OpenSPG engine and LLMs. It is used to build logical reasoning and factual Q&A solutions for professional domain knowledge ba…
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Search-o1: Agentic Search-Enhanced Large Reasoning Models