Stars
[CVPR2024] MMA-Diffusion: MultiModal Attack on Diffusion Models
Official Implementation for "Towards Reliable Verification of Unauthorized Data Usage in Personalized Text-to-Image Diffusion Models" (IEEE S&P 2025).
This is the code repository of our submission: Understanding the Dark Side of LLMs’ Intrinsic Self-Correction.
[NeurIPS 2024 D&B Track] UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models by Yihua Zhang, Chongyu Fan, Yimeng Zhang, Yuguang Yao, Jinghan Jia, Jiancheng …
Official implementation of NeurIPS'24 paper "Defensive Unlearning with Adversarial Training for Robust Concept Erasure in Diffusion Models". This work adversarially unlearns the text encoder to enh…
A survey on harmful fine-tuning attacks against large language models
A collection of awesome text-to-image generation studies.
A watermarking tool to protect artworks from AIGC-driven style mimicry (e.g., LoRA fine-tuning)
Official Code for ART: Automatic Red-teaming for Text-to-Image Models to Protect Benign Users (NeurIPS 2024)
PyTorch implementation of adversarial attacks [torchattacks]
A collection of resources on attacks and defenses targeting text-to-image diffusion models
[CCS'24] SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models
A collection of awesome papers I have read (carefully or roughly) in the field of diffusion model security. Any suggestions and comments are welcome ([email protected]).
A curated list of safety-related papers, articles, and resources focused on Large Language Models (LLMs). This repository aims to provide researchers, practitioners, and enthusiasts with insights i…
Source code and scripts for the paper "Is Difficulty Calibration All We Need? Towards More Practical Membership Inference Attacks"
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
[MM24 Oral] Identity-Driven Multimedia Forgery Detection via Reference Assistance
[NeurIPS 2024] This is the official repo of the paper "Lips Are Lying: Spotting the Temporal Inconsistency between Audio and Visual in Lip-syncing DeepFakes".
A curated list of papers & resources linked to data poisoning, backdoor attacks and defenses against them (no longer maintained)
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
Official implementation of "Active Image Indexing"
MulimgViewer is a multi-image viewer that opens multiple images in a single interface, making it convenient for image comparison and image stitching.