- Beijing University of Posts and Telecommunications
- Beijing
- https://www.zhihu.com/people/warrior-18-53
Stars
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming
An awesome repository & A comprehensive survey on interpretability of LLM attention heads.
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
The code for AED, a method that helps LLMs defend against jailbreaks
[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
Using sparse coding to find distributed representations used by neural networks.
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]
Repository for "StrongREJECT for Empty Jailbreaks" paper
Train transformer language models with reinforcement learning.
An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.
Papers and resources related to the security and privacy of LLMs 🤖
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
Persuasive Jailbreaker: we can persuade LLMs to jailbreak them!
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
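Use ChatGPT to summarize arXiv papers. Accelerates the entire research workflow: use ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and review responses.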
Reference implementation for DPO (Direct Preference Optimization)
Set of tools to assess and improve LLM security.
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
Reading list on hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"
Official inference library for Mistral models