
🌟 The Multi-Agent Framework: First AI Software Company, Towards Natural Language Programming

Python 47,776 5,667 Updated Feb 19, 2025

An awesome repository & A comprehensive survey on interpretability of LLM attention heads.

TeX 313 9 Updated Feb 12, 2025
Python 14 Updated Oct 19, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,497 362 Updated Feb 22, 2025

The code for AED, a method that helps LLMs defend against jailbreaks

Python 4 Updated Jul 29, 2024

[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Python 70 9 Updated Oct 4, 2024

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models

52 3 Updated Feb 17, 2025

[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"

104 5 Updated Sep 21, 2024

Using sparse coding to find distributed representations used by neural networks.

Jupyter Notebook 214 29 Updated Nov 10, 2023
Python 423 45 Updated Jul 19, 2024
Jupyter Notebook 41 3 Updated Jun 13, 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]

Python 300 33 Updated Sep 26, 2024

Repository for "StrongREJECT for Empty Jailbreaks" paper

Jupyter Notebook 118 5 Updated Nov 3, 2024

LLM training in simple, raw C/CUDA

Cuda 25,791 2,955 Updated Oct 2, 2024

Train transformer language models with reinforcement learning.

Python 11,978 1,613 Updated Feb 25, 2025

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,356 833 Updated Feb 22, 2025

Papers and resources related to the security and privacy of LLMs 🤖

Python 479 35 Updated Nov 27, 2024

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Python 518 48 Updated Feb 18, 2025

Persuasive Jailbreaker: we can persuade LLMs to jailbreak them!

HTML 283 19 Updated Oct 10, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 47,055 4,993 Updated Jan 22, 2025

SC-Safety: A multi-round adversarial safety benchmark for Chinese large language models

119 9 Updated Mar 15, 2024

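Use ChatGPT to summarize arXiv papers. Speed up the whole research workflow: use ChatGPT for full-paper summarization, professional translation, polishing, reviewing, and review responses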

Python 18,743 1,945 Updated Apr 4, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,397 200 Updated Aug 11, 2024

Set of tools to assess and improve LLM security.

Python 2,918 486 Updated Feb 14, 2025

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Python 9,344 731 Updated Aug 5, 2024

Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"

988 52 Updated Nov 21, 2024

The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"

332 29 Updated Apr 25, 2024

Official inference library for Mistral models

Jupyter Notebook 10,015 893 Updated Nov 12, 2024