Stars
Initialization using Update Approximation is a Silver Bullet for Extremely Efficient Low-Rank Fine-Tuning
This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
VPTQ, a flexible and extreme low-bit quantization algorithm
Agent S: an open agentic framework that uses computers like a human
PyTorch Implementation for Hyperbolic Fine-tuning for LLMs
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
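Reading the title literally, the idea is that a small number of layers score the full KV cache once, and the remaining layers reuse the same top-k positions for sparse attention. A rough single-head sketch of that flow under those assumptions (my reading of the title, not the repo's code; all names are hypothetical):

```python
import torch

def select_topk_positions(q, k, top_k):
    # A "token selection" layer attends over the full KV cache once and
    # records the indices of the top_k most-attended positions.
    scores = (q @ k.T) / k.shape[-1] ** 0.5        # q: (1, d), k: (seq, d)
    return scores.topk(top_k, dim=-1).indices[0]   # (top_k,)

def sparse_attention(q, k, v, positions):
    # Later layers keep reusing the same positions (position persistence)
    # and attend only over that slice of the cache.
    k_sel, v_sel = k[positions], v[positions]
    attn = torch.softmax((q @ k_sel.T) / k.shape[-1] ** 0.5, dim=-1)
    return attn @ v_sel

d, seq, top_k = 64, 1024, 32
q, k, v = torch.randn(1, d), torch.randn(seq, d), torch.randn(seq, d)
positions = select_topk_positions(q, k, top_k)     # chosen in an early layer
out = sparse_attention(q, k, v, positions)         # reused by the layers that follow
```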
[ICML2024 (Oral)] Official PyTorch implementation of DoRA: Weight-Decomposed Low-Rank Adaptation
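The one-line idea behind DoRA is easy to sketch: the pretrained weight is re-expressed as a trainable magnitude times a direction, and the LoRA update is applied only to the direction. A minimal PyTorch layer along those lines (module and parameter names are mine, not the official repo's API):

```python
import torch
import torch.nn as nn

class DoRALinear(nn.Module):
    """Toy DoRA-style layer: W' = m * (W0 + B @ A) / ||W0 + B @ A||_col."""

    def __init__(self, linear: nn.Linear, rank: int = 16):
        super().__init__()
        self.weight = nn.Parameter(linear.weight.detach().clone(), requires_grad=False)  # frozen W0
        self.bias = linear.bias
        out_f, in_f = self.weight.shape
        # Standard LoRA factors; B starts at zero so the layer equals W0 before training.
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))
        # Trainable magnitude vector, initialized to the column norms of W0.
        self.magnitude = nn.Parameter(self.weight.norm(dim=0, keepdim=True))

    def forward(self, x):
        combined = self.weight + self.lora_B @ self.lora_A         # W0 + BA (direction part)
        direction = combined / combined.norm(dim=0, keepdim=True)  # column-wise normalization
        return nn.functional.linear(x, self.magnitude * direction, self.bias)
```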
Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters"
[ICML 2024 Oral] The official implementation of our paper "Accurate LoRA-Finetuning Quantization of LLMs via Information Retention"
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Distributed LLM and StableDiffusion inference for mobile, desktop and server.
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Official JAX implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
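Both implementations above share one mechanic: the recurrent hidden state is itself a small model whose weights are updated by gradient descent as tokens arrive at test time. A toy PyTorch step illustrating that loop, with the paper's learned key/value/query views collapsed to the identity for brevity (function and variable names are hypothetical):

```python
import torch

def ttt_linear_step(W, x_t, lr=0.1):
    # The hidden "state" is the weight matrix W of a tiny inner linear model.
    W = W.detach().requires_grad_(True)
    # Self-supervised inner loss on the current token (plain reconstruction here;
    # the paper learns the input/target views, this sketch just reuses the token).
    loss = ((x_t @ W) - x_t).pow(2).mean()
    (grad,) = torch.autograd.grad(loss, W)
    W_new = (W - lr * grad).detach()       # one test-time gradient step = state update
    return x_t @ W_new, W_new              # output token, updated state

d = 8
W = torch.zeros(d, d)
for x_t in torch.randn(5, d):              # stream tokens through the "RNN"
    y_t, W = ttt_linear_step(W, x_t.unsqueeze(0))
```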
A State-Space Model with Rational Transfer Function Representation.
PyTorch code for Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers
mi-optimize is a versatile tool designed for the quantization and evaluation of large language models (LLMs). The library's seamless integration of various quantization methods and evaluation techn…
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Code for the NeurIPS 2024 paper "QuaRot": end-to-end 4-bit inference for large language models.
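For intuition on the rotation trick the description refers to: multiplying activations and weights by an orthogonal matrix leaves the layer's output unchanged but spreads outliers across dimensions, which makes plain 4-bit quantization far less lossy. A hedged sketch using a random orthogonal matrix in place of the Hadamard transforms the paper actually uses (helper names are mine):

```python
import torch

def random_orthogonal(d, seed=0):
    # Stand-in for a random Hadamard rotation: any orthogonal Q gives the
    # invariance (x Q)(Q^T W) = x W; Hadamard is just much cheaper to apply.
    g = torch.Generator().manual_seed(seed)
    q, _ = torch.linalg.qr(torch.randn(d, d, generator=g))
    return q

def quantize_int4(t):
    # Plain symmetric per-tensor 4-bit fake quantization, for illustration only.
    scale = t.abs().max() / 7
    return (t / scale).round().clamp(-8, 7) * scale

d = 64
W, x = torch.randn(d, d), torch.randn(4, d)
Q = random_orthogonal(d)
# Rotate activations and weights before quantizing; outliers get spread out,
# so the same 4-bit quantizer loses much less accuracy than on the raw tensors.
y_rotated = quantize_int4(x @ Q) @ quantize_int4(Q.T @ W)
y_plain   = quantize_int4(x) @ quantize_int4(W)
```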
[ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
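One ingredient this repo describes, learnable weight clipping, can be sketched as uniform fake quantization whose clipping range is steered by two learnable scalars. This is a simplified illustration under that assumption, not the library's actual API; the function and argument names are hypothetical:

```python
import torch

def lwc_fake_quant(W, gamma_raw, beta_raw, n_bits=4):
    # Learnable weight clipping: sigmoid-squashed scalars shrink the clipping
    # range around the weight extremes before uniform quantization.
    gamma, beta = torch.sigmoid(gamma_raw), torch.sigmoid(beta_raw)
    hi, lo = gamma * W.max(), beta * W.min()
    scale = (hi - lo) / (2 ** n_bits - 1)
    zero = (-lo / scale).round()
    soft = W / scale + zero
    hard = soft.round().clamp(0, 2 ** n_bits - 1)
    q = soft + (hard - soft).detach()       # straight-through estimator
    return (q - zero) * scale               # dequantized ("fake quant") weights

# The clipping scalars would normally be optimized block-by-block against the
# full-precision outputs; here they are just free parameters.
gamma_raw = torch.zeros(1, requires_grad=True)
beta_raw = torch.zeros(1, requires_grad=True)
W_q = lwc_fake_quant(torch.randn(64, 64), gamma_raw, beta_raw)
```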
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
The code for the paper "Pre-trained Vision-Language Models Learn Discoverable Concepts"
[ICML 2024] Official implementation of the paper "Consistent Diffusion Meets Tweedie"
AI Native Data App Development framework with AWEL (Agentic Workflow Expression Language) and Agents
[COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?