Stars
CFBench: A Comprehensive Constraints-Following Benchmark for LLMs
The source code for the blog post "The 37 Implementation Details of Proximal Policy Optimization"
A series of large language models developed by Baichuan Intelligent Technology
NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark
A 13B large language model developed by Baichuan Intelligent Technology
A large-scale 7B pretrained language model developed by BaiChuan-Inc.