Stars
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
A dual-clock asynchronous FIFO written in Verilog, tested with Icarus Verilog
Fast inference from large language models via speculative decoding
Low Precision Arithmetic Simulation in PyTorch
A PyTorch implementation of the Transformer model in "Attention is All You Need".
📰 Must-read papers and blogs on Speculative Decoding ⚡️
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
⚡️HivisionIDPhotos: a lightweight and efficient AI ID photo tool.
High-speed downloads from mirror sites using HuggingFace's official download tool.
This repository contains demos I made with the Transformers library by HuggingFace.
A machine learning compiler for GPUs, CPUs, and ML accelerators
A high-throughput and memory-efficient inference and serving engine for LLMs
PyTorch Tutorial for Deep Learning Researchers
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Tender: Accelerating Large Language Models via Tensor Decomposition and Runtime Requantization (ISCA'24)
The official GitHub page for the survey paper "A Survey of Large Language Models".
20+ high-performance LLMs with recipes to pretrain, fine-tune, and deploy at scale.
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, INT8 and GPTQ 4-bit quantization, LoRA and LLaMA-Adapter fine-tuning, and pre-training. Apache 2.0-licensed.
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure
A collection of pre-trained, state-of-the-art models in the ONNX format
Intermediate Language (IL) for Hardware Accelerator Generators