Highlights

  • Pro

Organizations

@torrvision @apartresearch @EdinAISafetyHub


Code for the EMNLP 2024 paper "Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps"

Python 119 7 Updated Aug 13, 2024

Simple, unified interface to multiple Generative AI providers

Python 11,813 1,154 Updated Mar 26, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,393 270 Updated Mar 24, 2025

An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…

TypeScript 15,059 1,545 Updated Mar 24, 2025

A curated list of LLM interpretability-related material: tutorials, libraries, surveys, papers, blogs, etc.

213 8 Updated Mar 20, 2025

Benchmark to evaluate different LLMs for pragmatic (individual/context-specific) harms

Jupyter Notebook 3 1 Updated Dec 15, 2024

Sparse autoencoders

Python 1 Updated Oct 12, 2024

aider is AI pair programming in your terminal

Python 30,128 2,728 Updated Mar 27, 2025
Python 124 8 Updated Mar 24, 2025

LLM101n: Let's build a Storyteller

32,982 1,803 Updated Aug 1, 2024

HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal

Jupyter Notebook 593 84 Updated Aug 16, 2024

We focus on the behavior of AI, and the Cyber Soul. We investigate the alignment dynamics with deliberately designed experiments.

CSS 3 Updated Jul 14, 2024

Inspect: A framework for large language model evaluations

Python 847 198 Updated Mar 28, 2025

A trivial programmatic Llama 3 jailbreak. Sorry Zuck!

Python 541 61 Updated Jan 26, 2025

A collection of different ways to implement accessing and modifying internal model activations for LLMs

Jupyter Notebook 14 Updated Oct 18, 2024

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM

Jupyter Notebook 1,413 169 Updated Mar 21, 2025

Evaluating LLMs with fewer examples

Jupyter Notebook 148 16 Updated Apr 12, 2024

Experimental AI Agents Framework

C# 263 60 Updated Feb 23, 2025

Explore what LLMs are really learning over SFT

Python 28 2 Updated Mar 30, 2024

Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…

Jupyter Notebook 16,547 2,394 Updated Mar 26, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 37,654 4,321 Updated Mar 28, 2025

Finding trojans in aligned LLMs. Official repository for the competition hosted at SaTML 2024.

Python 109 9 Updated Jun 13, 2024

Representation Engineering: A Top-Down Approach to AI Transparency

Jupyter Notebook 810 94 Updated Aug 14, 2024
Jupyter Notebook 14 2 Updated Mar 31, 2024

Solve puzzles. Improve your pytorch.

Jupyter Notebook 3,493 317 Updated Jul 15, 2024

Solve puzzles. Learn CUDA.

Jupyter Notebook 10,801 834 Updated Sep 1, 2024

Conference schedule, top papers, and analysis of the data for NeurIPS 2023!

Jupyter Notebook 119 7 Updated Dec 15, 2023

List of papers on hallucination detection in LLMs.

811 64 Updated Mar 7, 2025

An extension of the PyMARL codebase that includes additional algorithms and environment support

Python 1 Updated Oct 2, 2023