This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & V…

735 42 Updated Oct 22, 2024

swtheing / PF-PPO-RLHF

Python 30 1 Updated Sep 14, 2024

Allenpandas / BJTU-CS-Notebook

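👨‍🎓 Graduate course materials, notes, final exam papers reconstructed from memory and compiled, and course assignments from the School of Computer Science and Technology at Beijing Jiaotong University. Hope they help you ❤️; if you like it, remember to give it a star 🌟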

Jupyter Notebook 154 31 Updated Sep 19, 2024

cjh0613 / XuetangX-GetAnswer

Get exercise answers for XuetangX.

Python 3 5 Updated May 11, 2020

RLHFlow / RLHF-Reward-Modeling

Recipes to train reward model for RLHF.

Python 1,094 76 Updated Dec 12, 2024

RLHFlow / Online-RLHF

A recipe for online RLHF and online iterative DPO.

Python 456 51 Updated Dec 28, 2024

datawhalechina / easy-rl

Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read online at https://datawhalechina.github.io/easy-rl/

Jupyter Notebook 9,946 1,909 Updated Jan 19, 2025

hangsz / reinforcement_learning

The [Hands-on Reinforcement Learning] series, based on PyTorch.

Python 54 25 Updated Jun 2, 2021

ailabx / ailabx

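AI Quant Lab, focused on applying cutting-edge AI techniques (deep learning / reinforcement learning / knowledge graphs) to quantitative financial investing.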

HTML 718 178 Updated Sep 19, 2023

codefuse-ai / Awesome-Code-LLM

[TMLR] A curated list of language modeling research for code (and other software engineering activities), plus related datasets.

1,970 125 Updated Jan 17, 2025

GAIR-NLP / BeHonest

BeHonest: Benchmarking Honesty in Large Language Models

JavaScript 31 Updated Aug 15, 2024

NexaAI / Awesome-LLMs-on-device

Awesome LLMs on Device: A Comprehensive Survey

922 101 Updated Jan 12, 2025

mit-han-lab / TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

C++ 793 77 Updated Jul 4, 2024

GAIR-NLP / alignment-for-honesty

Python 71 2 Updated May 22, 2024

andyzoujm / representation-engineering

Representation Engineering: A Top-Down Approach to AI Transparency

Jupyter Notebook 777 89 Updated Aug 14, 2024

ydyjya / Awesome-LLM-Safety

A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…

1,124 57 Updated Jan 19, 2025

WECENG / ticket-purchase

Automated ticket grabbing for Damai, with support for selecting attendees, city, date and session, and price.

Python 879 134 Updated Apr 28, 2024

IAAR-Shanghai / UHGEval

[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.

Python 159 17 Updated Nov 12, 2024

ADaM-BJTU / W2SG

The code of “Improving Weak-to-Strong Generalization with Scalable Oversight and Ensemble Learning”

Python 15 1 Updated Feb 26, 2024

FairyLinya

Stars

ADaM-BJTU / OpenRFT

reasoning-survey / Awesome-Reasoning-Foundation-Models

ADaM-BJTU / O1-CODER

princeton-nlp / SimPO

eric-mitchell / direct-preference-optimization

dvlab-research / Step-DPO

RUCAIBox / RLMEC

wutaiqiang / LLM_KD_AKL

OpenCoder-llm / OpenCoder-llm

OpenBMB / UltraFeedback

JetBrains-Research / lca-baselines

Tebmer / Awesome-Knowledge-Distillation-of-LLMs