Stars
Out-of-distribution detection, robustness, and generalization resources. The repository contains a curated list of papers, tutorials, books, videos, articles, open-source libraries, and more.
Röttger et al. (2023): "XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models"
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
NCAlgebra - Non Commutative Algebra Package for Mathematica
torch-optimizer -- a collection of optimizers for PyTorch
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Robust recipes to align language models with human and AI preferences
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Recipes to train reward models for RLHF.
RewardBench: the first evaluation tool for reward models.
Train transformer language models with reinforcement learning.
Code for the paper "ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models"
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
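  The core idea behind LoRA can be sketched in a few lines: freeze the pretrained weight `W` and learn a low-rank update `(alpha/r) * B @ A`, with `B` zero-initialized so training starts from the pretrained model. This is a minimal NumPy illustration of that idea, not loralib's actual API; all names here are illustrative.

  ```python
  import numpy as np

  rng = np.random.default_rng(0)

  d_out, d_in, r, alpha = 8, 8, 2, 4          # r << d gives the parameter savings
  W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
  A = rng.normal(size=(r, d_in)) * 0.01       # trainable down-projection
  B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

  def lora_forward(x):
      # y = x W^T + (alpha/r) * x A^T B^T ; only A and B would be trained
      return x @ W.T + (alpha / r) * (x @ A.T) @ B.T
  ```

  Because `B` starts at zero, the adapted layer initially computes exactly the frozen layer's output, and only `r * (d_in + d_out)` extra parameters are trained per layer.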
A simple way to calibrate your neural network.
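  The repository above implements temperature scaling, which divides a model's logits by a single scalar `T` fitted to minimize negative log-likelihood on held-out data. A minimal NumPy sketch of that idea (a grid search stands in for the repo's gradient-based optimization; function names are illustrative):

  ```python
  import numpy as np

  def softmax(z):
      z = z - z.max(axis=1, keepdims=True)   # numerically stable softmax
      e = np.exp(z)
      return e / e.sum(axis=1, keepdims=True)

  def nll(logits, labels, T):
      # negative log-likelihood of the temperature-scaled probabilities
      probs = softmax(logits / T)
      return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

  def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
      # pick the T that minimizes validation NLL; T > 1 softens
      # overconfident predictions without changing the argmax
      return min(grid, key=lambda T: nll(logits, labels, T))
  ```

  Since dividing logits by a positive scalar preserves their ordering, calibration with temperature scaling leaves accuracy unchanged.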
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
Use ChatGPT to summarize arXiv papers. Accelerates the full research workflow with ChatGPT: full-paper summarization, professional translation, polishing, peer review, and review responses.