peilin-chen

Peilin Chen peilin-chen

Ph.D. student at the University of Virginia. His research interests are Digital/Mixed-signal IC Design, AI Chips, and Computer Architecture.

29 followers · 1 following

Charlottesville, VA, USA

peilin-chen.github.io Public

JavaScript MIT License Updated Jan 13, 2025
ml-retreat Public
Forked from hesamsheikh/ml-retreat

Machine Learning Journal for Intermediate to Advanced Topics.

Jupyter Notebook Updated Nov 5, 2024
siliwiz Public
Forked from TinyTapeout/siliwiz

Silicon Layout Wizard

JavaScript Other Updated Sep 14, 2024
gemmini Public
Forked from ucb-bar/gemmini

Berkeley's Spatial Array Generator

Scala 1 Other Updated Aug 14, 2024
qserve Public
Forked from mit-han-lab/qserve

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python Apache License 2.0 Updated Aug 13, 2024
H2O Public
Forked from FMInference/H2O

[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.

Python Updated Aug 1, 2024
KIVI Public
Forked from jy-yuan/KIVI

KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python MIT License Updated Jul 27, 2024
OmniQuant Public
Forked from OpenGVLab/OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python MIT License Updated Jul 24, 2024
vortex Public
Forked from vortexgpgpu/vortex

Verilog Apache License 2.0 Updated Jul 19, 2024
llm-awq Public
Forked from mit-han-lab/llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Python MIT License Updated Jul 16, 2024
ventus-gpgpu Public
Forked from THU-DSP-LAB/ventus-gpgpu

GPGPU processor supporting RISCV-V extension, developed with Chisel HDL

Scala Other Updated Jul 10, 2024
ventus-gpgpu-verilog Public
Forked from THU-DSP-LAB/ventus-gpgpu-verilog

GPGPU supporting RISCV-V, developed with Verilog HDL

Verilog Updated Jul 8, 2024
TensorRT Public
Forked from pytorch/TensorRT

PyTorch/TorchScript/FX compiler for NVIDIA GPUs using TensorRT

Python BSD 3-Clause "New" or "Revised" License Updated Jul 5, 2024
TinyChatEngine Public
Forked from mit-han-lab/TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library

C++ MIT License Updated Jul 4, 2024
Zhulong-RISCV-CPU Public

CPU Design Based on RISCV ISA

integrated-circuits processor-design

Verilog 83 20 MIT License Updated Jun 14, 2024
flash-attention Public
Forked from Dao-AILab/flash-attention

Fast and memory-efficient exact attention

Python BSD 3-Clause "New" or "Revised" License Updated Jun 6, 2024
AISystem Public
Forked from chenzomi12/AISystem

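AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks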

Jupyter Notebook Apache License 2.0 Updated May 22, 2024
hardware-accelerator-for-LLM Public
Forked from Soumya2754/hardware-accelerator-for-LLM

Major project - Kannada LLM for farmers

Verilog Updated May 20, 2024
AutoSmoothQuant Public
Forked from AniZpZ/AutoSmoothQuant

An easy-to-use package for implementing SmoothQuant for LLMs

Python MIT License Updated May 18, 2024
basejump_stl Public
Forked from bespoke-silicon-group/basejump_stl

BaseJump STL: A Standard Template Library for SystemVerilog

SystemVerilog Other Updated May 14, 2024
spatten-llm Public
Forked from mit-han-lab/spatten

[HPCA'21] SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning

Scala MIT License Updated May 3, 2024
smoothquant Public
Forked from mit-han-lab/smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Python MIT License Updated Apr 28, 2024
tiny-gpu Public
Forked from adam-maj/tiny-gpu

A minimal GPU design in Verilog to learn how GPUs work from the ground up

SystemVerilog 1 Updated Apr 28, 2024
metaseq Public
Forked from facebookresearch/metaseq

Repo for external large-scale work

Python MIT License Updated Apr 27, 2024
LLMsPracticalGuide Public
Forked from Mooler0410/LLMsPracticalGuide

A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)

Updated Apr 22, 2024
KVQuant Public
Forked from SqueezeAILab/KVQuant

KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Python Updated Apr 19, 2024
FlexGen Public
Forked from FMInference/FlexLLMGen

Running large language models on a single GPU for throughput-oriented scenarios.

Python Apache License 2.0 Updated Apr 19, 2024
llama3 Public
Forked from meta-llama/llama3

The official Meta Llama 3 GitHub site

Python Other Updated Apr 19, 2024
llama Public
Forked from meta-llama/llama

Inference code for Llama models

Python Other Updated Apr 10, 2024
gpgpu-sim_distribution Public
Forked from gpgpu-sim/gpgpu-sim_distribution

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as…

C++ Other Updated Apr 8, 2024
