May Sparkles Everyday

Starred repositories

Python 27 1 Updated Nov 15, 2024

InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw

Python 57 6 Updated Oct 4, 2024
Python 3 Updated Nov 26, 2024

A Mutation Testing Framework of In-Context Learning Systems

Python 3 Updated Sep 30, 2024

Code and example data for the paper: Rule Based Rewards for Language Model Safety

Jupyter Notebook 178 16 Updated Jul 19, 2024

Code release for "Improved baselines for vision-language pre-training"

Python 59 1 Updated May 6, 2024
Jupyter Notebook 149 17 Updated Feb 12, 2025

Schedule-Free Optimization in PyTorch

Python 2,092 71 Updated Dec 2, 2024

🚀 Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton

Python 1,887 113 Updated Feb 11, 2025

Utilities intended for use with Llama models.

Python 5,779 976 Updated Feb 7, 2025

Official code for NeurIPS 2023 paper "Laplacian Canonization: A Minimalist Approach to Sign and Basis Invariant Spectral Embedding".

Python 2 Updated Jan 20, 2024

Evaluate interpretability methods on localizing and disentangling concepts in LLMs.

Python 37 6 Updated Oct 5, 2024

Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.

Python 826 84 Updated Oct 14, 2024

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.

Jupyter Notebook 62 14 Updated Nov 5, 2024

The nnsight package enables interpreting and manipulating the internals of deep learned models.

Jupyter Notebook 485 42 Updated Feb 12, 2025

AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM

Python 54 5 Updated Nov 3, 2024
Jupyter Notebook 54 9 Updated Nov 17, 2024

A library for mechanistic interpretability of GPT-style language models

Python 1,846 332 Updated Feb 8, 2025
Python 139 27 Updated Jan 27, 2025

Diffusion on syntax trees for program synthesis

Python 441 27 Updated Jun 27, 2024
Python 1,787 55 Updated Jun 28, 2024

[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning

Python 88 4 Updated May 23, 2024

Notebooks accompanying Anthropic's "Toy Models of Superposition" paper

Jupyter Notebook 109 13 Updated Sep 14, 2022

OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift. ICML 2024 and ICLRW-DMLR 2024

Python 19 1 Updated Jul 25, 2024

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

1,217 43 Updated Feb 7, 2025
Python 8 1 Updated Apr 19, 2024

Jailbreak artifacts for JailbreakBench

48 8 Updated Nov 6, 2024

Context is Environment

Python 9 Updated Mar 8, 2024