haozheji


DeepEP: an efficient expert-parallel communication library

Cuda 7,479 723 Updated Apr 22, 2025

Minimal RLHF implementation built on top of minGPT.

Python 29 2 Updated Jul 4, 2024

Create Epic Math and Physics Animations From Text.

Python 931 103 Updated Mar 30, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 7,079 778 Updated Apr 24, 2025

This repository includes some detailed proofs of "Bias Variance Decomposition for KL Divergence".

4 Updated Sep 25, 2021

Scalable toolkit for efficient model alignment

Python 773 97 Updated Apr 23, 2025

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook 1,725 267 Updated Dec 27, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)

Python 6,407 630 Updated Apr 24, 2025

Repository for "Generative Flow Networks as Entropy-Regularized RL" (AISTATS-2024, Oral)

Python 34 1 Updated Apr 21, 2024

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

Python 876 61 Updated Feb 16, 2025

An in-browser, local-first Markdown resume builder.

TypeScript 536 99 Updated Jul 11, 2024

Python 23 4 Updated Sep 24, 2024

A PowerPoint add-in to insert LaTeX equations into PowerPoint presentations on Windows and Mac

VBA 1,022 68 Updated Jan 30, 2025

A python Linear Programming API

Python 2,234 406 Updated Apr 16, 2025

Oh my tmux! My self-contained, pretty & versatile tmux configuration made with 💛🩷💙🖤❤️🤍

Shell 22,880 3,420 Updated Apr 2, 2025

Puzzles for learning Triton

Jupyter Notebook 1,594 126 Updated Nov 18, 2024

Grok open release

Python 50,236 8,346 Updated Aug 30, 2024

Paper list about multimodal and large language models, only used to record papers I read in the daily arxiv for personal needs.

621 42 Updated Apr 24, 2025

ICLR2023 - Tailoring Language Generation Models under Total Variation Distance

Python 21 1 Updated Feb 8, 2023

Some preliminary explorations of Mamba's context scaling.

Python 212 10 Updated Feb 8, 2024

Easy TOC creation for GitHub README.md

Shell 3,264 2,742 Updated Oct 12, 2024

Example models using DeepSpeed

Python 6,458 1,087 Updated Apr 20, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 45,692 7,057 Updated Apr 24, 2025

Robust recipes to align language models with human and AI preferences

Python 5,143 442 Updated Nov 21, 2024

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Python 4,629 476 Updated Jan 8, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,538 207 Updated Aug 11, 2024

DSPy: The framework for programming—not prompting—language models

Python 23,706 1,825 Updated Apr 24, 2025

A curated list of reinforcement learning with human feedback resources (continually updated)

3,901 239 Updated Feb 19, 2025

Inference code for CodeLlama models

Python 16,278 1,909 Updated Aug 12, 2024

A retro game engine for Python

Rust 16,202 873 Updated Apr 17, 2025