University of Alberta, Edmonton, Canada

Stars
Repository to host and maintain scale-sim-v2 code
An integrated cache and memory access time, cycle time, area, leakage, and dynamic power model
Reorder-based post-training quantization for large language models
Using ideas from product quantization for state-of-the-art neural network compression.
Fast and accurate DRAM power and energy estimation tool
New generation entropy codecs: Finite State Entropy and Huff0
The official implementation of the EMNLP 2023 paper LLM-FP4
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Code accompanying the paper "Massive Activations in Large Language Models"
This is a collection of our zero-cost NAS and efficient vision applications.
Post-Training Quantization for Vision Transformers.
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
A paper list of some recent Transformer-based CV works.
Code for "Atalanta: A Bit is Worth a 'Thousand' Tensor Values"
[IJCAI 2022] FQ-ViT: Post-Training Quantization for Fully Quantized Vision Transformer
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
[ICCV 2023] RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
A bit-level sparsity-aware multiply-accumulate processing element.
A visualization and transformation tool for PyTorch models
PDPU: An Open-Source Posit Dot-Product Unit for Deep Learning Applications