Stars
Out-of-distribution detection, robustness, and generalization resources. The repository contains a curated list of papers, tutorials, books, videos, articles, open-source libraries, and more.
Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models"
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
NCAlgebra - Non Commutative Algebra Package for Mathematica
torch-optimizer -- a collection of optimizers for PyTorch
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Robust recipes to align language models with human and AI preferences
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Recipes to train reward models for RLHF.
RewardBench: the first evaluation tool for reward models.
Train transformer language models with reinforcement learning.
Code for the paper "ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models"
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
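  The core idea behind LoRA can be sketched in a few lines: freeze the pretrained weight `W` and learn a low-rank update `(alpha/r) * B @ A`, with `B` zero-initialized so training starts from the pretrained model. This is a minimal NumPy illustration of that idea, not loralib's actual API; all names here are illustrative.

  ```python
  import numpy as np

  rng = np.random.default_rng(0)

  d_out, d_in, r, alpha = 8, 8, 2, 4          # r << d gives the parameter savings
  W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
  A = rng.normal(size=(r, d_in)) * 0.01       # trainable down-projection
  B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

  def lora_forward(x):
      # y = x W^T + (alpha/r) * x A^T B^T ; only A and B would be trained
      return x @ W.T + (alpha / r) * (x @ A.T) @ B.T
  ```

  Because `B` starts at zero, the adapted layer initially computes exactly the frozen layer's output, and only `r * (d_in + d_out)` extra parameters are trained per layer.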
A simple way to calibrate your neural network.
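  The repository above implements temperature scaling, which divides a model's logits by a single scalar `T` fitted to minimize negative log-likelihood on held-out data. A minimal NumPy sketch of that idea (a grid search stands in for the repo's gradient-based optimization; function names are illustrative):

  ```python
  import numpy as np

  def softmax(z):
      z = z - z.max(axis=1, keepdims=True)   # numerically stable softmax
      e = np.exp(z)
      return e / e.sum(axis=1, keepdims=True)

  def nll(logits, labels, T):
      # negative log-likelihood of the temperature-scaled probabilities
      probs = softmax(logits / T)
      return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

  def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
      # pick the T that minimizes validation NLL; T > 1 softens
      # overconfident predictions without changing the argmax
      return min(grid, key=lambda T: nll(logits, labels, T))
  ```

  Since dividing logits by a positive scalar preserves their ordering, calibration with temperature scaling leaves accuracy unchanged.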
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Use ChatGPT to summarize arXiv papers. Accelerates the full research workflow with ChatGPT: full-paper summarization, professional translation, polishing, peer review, and review responses.