Yifei Wang yifeiwang77

✨

May Sparkles Everyday

Postdoc @ MIT CSAIL

52 followers · 40 following

Beijing
yifeiwang77.com

Lists (1)

🔮 Future ideas

1 repository

Starred repositories

PKU-ML / LongPPL

Python · 27 stars · 1 fork · Updated Nov 15, 2024

qishenghu / InstructCoder

InstructCoder: Instruction Tuning Large Language Models for Code Editing | Oral ACL-2024 srw

Python · 57 stars · 6 forks · Updated Oct 4, 2024

kaotty / Understanding-ESSL

Python · 3 stars · Updated Nov 26, 2024

yifeiwang77 / Self-Correction

Python · 19 stars · 1 fork · Updated Nov 3, 2024

weizeming / MILE

A Mutation Testing Framework of In-Context Learning Systems

Python · 3 stars · Updated Sep 30, 2024

openai / safety-rbr-code-and-data

Code and example data for the paper: Rule Based Rewards for Language Model Safety

Jupyter Notebook · 178 stars · 16 forks · Updated Jul 19, 2024

facebookresearch / clip-rocket

Code release for "Improved baselines for vision-language pre-training"

Python · 59 stars · 1 fork · Updated May 6, 2024

EleutherAI / delphi

Jupyter Notebook · 149 stars · 17 forks · Updated Feb 12, 2025

facebookresearch / schedule_free

Schedule-Free Optimization in PyTorch

Python · 2,092 stars · 71 forks · Updated Dec 2, 2024

fla-org / flash-linear-attention

🚀 Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton

Python · 1,887 stars · 113 forks · Updated Feb 11, 2025

meta-llama / llama-models

Utilities intended for use with Llama models.

Python · 5,779 stars · 976 forks · Updated Feb 7, 2025

GeorgeMLP / laplacian-canonization

Official code for NeurIPS 2023 paper "Laplacian Canonization: A Minimalist Approach to Sign and Basis Invariant Spectral Embedding".

Python · 2 stars · Updated Jan 20, 2024

explanare / ravel

Evaluate interpretability methods on localizing and disentangling concepts in LLMs.

Python · 37 stars · 6 forks · Updated Oct 5, 2024

eloialonso / iris

Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.

Python · 826 stars · 84 forks · Updated Oct 14, 2024

ajyl / dpo_toxic

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity.

Jupyter Notebook · 62 stars · 14 forks · Updated Nov 5, 2024

ndif-team / nnsight

The nnsight package enables interpreting and manipulating the internals of deep learned models.

Jupyter Notebook · 485 stars · 42 forks · Updated Feb 12, 2025

OSU-NLP-Group / AmpleGCG

AmpleGCG: Learning a Universal and Transferable Generator of Adversarial Attacks on Both Open and Closed LLM

Python · 54 stars · 5 forks · Updated Nov 3, 2024

jacobdunefsky / transcoder_circuits

Jupyter Notebook · 54 stars · 9 forks · Updated Nov 17, 2024

TransformerLensOrg / TransformerLens

A library for mechanistic interpretability of GPT-style language models

Python · 1,846 stars · 332 forks · Updated Feb 8, 2025

saprmarks / feature-circuits

Python · 139 stars · 27 forks · Updated Jan 27, 2025

revalo / tree-diffusion

Diffusion on syntax trees for program synthesis

Python · 441 stars · 27 forks · Updated Jun 27, 2024

ytongbai / LVM

Python · 1,787 stars · 55 forks · Updated Jun 28, 2024

SafeAILab / RAIN

[ICLR'24] RAIN: Your Language Models Can Align Themselves without Finetuning

Python · 88 stars · 4 forks · Updated May 23, 2024

anthropics / toy-models-of-superposition

Notebooks accompanying Anthropic's "Toy Models of Superposition" paper

Jupyter Notebook · 109 stars · 13 forks · Updated Sep 14, 2022

openai / automated-interpretability

Python · 988 stars · 116 forks · Updated Mar 6, 2024

OODRobustBench / OODRobustBench

OODRobustBench: a Benchmark and Large-Scale Analysis of Adversarial Robustness under Distribution Shift. ICML 2024 and ICLRW-DMLR 2024

Python · 19 stars · 1 fork · Updated Jul 25, 2024

Xnhyacinth / Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

1,217 stars · 43 forks · Updated Feb 7, 2025

kylehkhsu / tripod

Python · 8 stars · 1 fork · Updated Apr 19, 2024

JailbreakBench / artifacts

Jailbreak artifacts for JailbreakBench

48 stars · 8 forks · Updated Nov 6, 2024

facebookresearch / ICRM

Context is Environment

Python · 9 stars · Updated Mar 8, 2024

vq-vae