deter3

8 followers · 6 following

Highlights

Developer Program Member

Lists (3)

ios mistral

mistral windows

windows mistral

Starred repositories

ByteDance-Seed / Seed-Thinking-v1.5

682 9 Updated Apr 17, 2025

tuber0613 / hot_news_daily_push

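A tool that automatically collects trending news from major platforms (with a focus on AI topics), RSS feeds, and specific Twitter feeds, then processes, deduplicates, and summarizes it and pushes the resulting digests through multiple channels. The project was written entirely by Cursor and Trae, working in relay.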

Python 43 7 Updated Apr 3, 2025

OpenPipe / ART

OpenPipe ART (Agent Reinforcement Trainer): train LLM agents

Python 91 1 Updated Apr 18, 2025

jruizgit / rules

Durable Rules Engine

JavaScript 1,202 212 Updated Mar 6, 2024

lzhxmu / CPPO

CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models

Python 110 5 Updated Apr 18, 2025

UCSC-VLAA / VLAA-Thinking

SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models

Python 81 Updated Apr 17, 2025

hao-ai-lab / ReFoRCE

A Text-to-SQL Agent with Self-Refinement, Format Restriction, and Column Exploration

Python 19 3 Updated Apr 11, 2025

brendanhogan / DeepSeekRL-Extended

Exploring Applications of GRPO

Python 180 18 Updated Apr 11, 2025

agno-agi / agno

Agno is a lightweight library for building Agents with memory, knowledge, tools and reasoning.

Python 25,257 3,207 Updated Apr 18, 2025

amaydle / experiments

A playground for code experiments, snippets, and small-scale projects in one organized repository.

Jupyter Notebook 2 1 Updated Mar 23, 2025

terrierteam / pyterrier_colbert

Jupyter Notebook 86 35 Updated Apr 3, 2025

abehou / CLERC

Code repo for CLERC: A Legal Precedent Dataset for Case Retrieval and Retrieval-Augmented Analysis Generation (NAACL 2025)

Python 11 1 Updated Jan 28, 2025

PatentsView / PatentsView-API

HTML 17 11 Updated Mar 28, 2024

Knowledgator / GLiClass

Generalist and Lightweight Model for Text Classification

Python 119 11 Updated Apr 11, 2025

yai333 / Text-to-SQL-GRPO-Fine-tuning-Pipeline

This repository contains a pipeline for fine-tuning Large Language Models (LLMs) for Text-to-SQL conversion using Group Relative Policy Optimization (GRPO).

Python 14 5 Updated Apr 16, 2025

axolotl-ai-cloud / grpo_code

A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.

Python 22 4 Updated Apr 4, 2025

Open-Reasoner-Zero / Open-Reasoner-Zero

Official Repo for Open-Reasoner-Zero

Python 1,870 96 Updated Apr 8, 2025

foreveryh / mentis

Mentis: A powerful multi-agent orchestration framework built on LangGraph.

Python 208 18 Updated Apr 16, 2025

WooooDyy / AgentGym

Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.

Python 448 56 Updated Mar 11, 2025

OpenManus / OpenManus-RL

A live-streamed development of RL tuning for LLM agents

Python 2,434 324 Updated Apr 18, 2025

Tufalabs / BeyondNextTokenPrediction

Python 16 2 Updated Mar 24, 2025

facebookresearch / sweet_rl

Benchmark and research code for the paper "SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks"

Python 177 10 Updated Apr 13, 2025

Legionof7 / GRPOdx

Python 1 Updated Mar 15, 2025

marimo-team / marimo

A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. All in a modern, AI-native editor.

Python 12,496 494 Updated Apr 18, 2025

ViennaRNA / ViennaRNA

The ViennaRNA Package

C 336 80 Updated Oct 20, 2024

ZongqianLi / ReasonGraph

Repository for the demo and paper: ReasonGraph: Visualisation of Reasoning Paths

HTML 457 38 Updated Mar 27, 2025

kuzudb / baml-kuzu-demo

Demo of knowledge graph creation and Graph RAG with BAML and Kuzu

Python 25 3 Updated Mar 13, 2025

faridani / Pymetheus

Dataset of Python Codes for Reinforcement Learning

Python 2 Updated Mar 8, 2025

SakanaAI / AI-Scientist

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 10,754 1,554 Updated Apr 16, 2025

Mr-Jack-Tung / AGDPO-Algorithm-Thesis

2 Updated Mar 15, 2025

Starred topics

mixed-integer-programming