Stars
Sparsify transformers with SAEs and transcoders
A fast inference library for running LLMs locally on modern consumer-class GPUs
Experimental LLM Inference UX to aid in creative writing
Training LLMs with QLoRA + FSDP
A library for mechanistic interpretability of GPT-style language models
The first open-source Artificial Narrow Intelligence generalist agentic framework Computer-Using-Agent that fully operates graphical user interfaces (GUIs) using only natural language. Uses Visu…
Code for the NeurIPS 2024 paper QuaRot: end-to-end 4-bit inference of large language models.
The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
Fast Hadamard transform in CUDA, with a PyTorch interface