Out-of-distribution detection, robustness, and generalization resources. The repository contains a curated list of papers, tutorials, books, videos, articles, and open-source libraries.

854 70 Updated Nov 21, 2024

Repo for or-bench

Python 6 Updated Jun 24, 2024

Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models"

Jupyter Notebook 75 9 Updated Dec 28, 2023

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Python 1,390 119 Updated Jun 13, 2024

NCAlgebra - Non Commutative Algebra Package for Mathematica

Mercury 159 24 Updated Nov 4, 2023

torch-optimizer -- collection of optimizers for PyTorch

Python 3,063 298 Updated Mar 22, 2024

The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬

Jupyter Notebook 8,558 1,232 Updated Nov 8, 2024

Robust recipes to align language models with human and AI preferences

Python 4,874 420 Updated Nov 21, 2024

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,322 827 Updated Jan 8, 2025

Recipes to train reward models for RLHF.

Python 1,121 76 Updated Dec 12, 2024

RewardBench: the first evaluation tool for reward models.

Python 481 56 Updated Dec 11, 2024

Train transformer language models with reinforcement learning.

Python 10,545 1,363 Updated Jan 8, 2025

Neural Tangent Kernel Papers

100 10 Updated Jan 8, 2025

Python 5 1 Updated May 24, 2023

Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)

Python 167 13 Updated Dec 16, 2023

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 11,070 698 Updated Dec 17, 2024

A simple way to calibrate your neural network.

Python 1,120 161 Updated Aug 24, 2021

differential topology

TeX 5 2 Updated Apr 20, 2017

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

Python 6,015 682 Updated Nov 14, 2024

Batchfile 2 Updated Mar 20, 2021

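Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.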

Python 18,653 1,945 Updated Apr 4, 2024

TeX 17 8 Updated Aug 7, 2024