zjersey

Jersey zjersey

22 followers · 24 following

@bytedance @Thinklab-SJTU
Shanghai

Stars

triton-lang / triton

Development repository for the Triton language and compiler

MLIR 15,645 1,990 Updated May 23, 2025

thu-pacman / chitu

High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.

Python 1,117 74 Updated May 22, 2025

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Python 5,368 598 Updated May 20, 2025

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda 7,680 772 Updated May 23, 2025

deepseek-ai / FlashMLA

FlashMLA: Efficient MLA decoding kernels

Cuda 11,564 835 Updated Apr 29, 2025

deepseek-ai / DeepSeek-V3

Python 97,034 15,780 Updated Apr 9, 2025

deepseek-ai / DeepSeek-R1

89,409 11,555 Updated Apr 9, 2025

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 10,537 1,444 Updated May 23, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 47,887 7,558 Updated May 23, 2025

Zhen-Dong / Awesome-Quantization-Papers

List of papers related to neural network quantization in recent AI conferences and journals.

630 49 Updated Mar 27, 2025

BlinkDL / RWKV-LM

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 13,627 915 Updated May 22, 2025

dair-ai / ML-Papers-of-the-Week

🔥Highlighting the top ML papers every week.

11,271 686 Updated Apr 11, 2025

bitsandbytes-foundation / bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Python 7,057 698 Updated May 22, 2025

Hannibal046 / Awesome-LLM

Awesome-LLM: a curated list of Large Language Models

23,452 1,962 Updated May 9, 2025

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python 12,403 2,782 Updated May 22, 2025

OpenMOSS / MOSS

An open-source tool-augmented conversational language model from Fudan University

Python 12,049 1,147 Updated Jul 13, 2024

ggml-org / llama.cpp

LLM inference in C/C++

C++ 80,719 11,873 Updated May 23, 2025

Guangxuan-Xiao / torch-int

This repository contains integer operators on GPUs for PyTorch.

Python 205 54 Updated Sep 29, 2023

mit-han-lab / smoothquant

[ICML 2023] SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Python 1,409 174 Updated Jul 12, 2024

deepspeedai / DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 38,510 4,383 Updated May 23, 2025

forthespada / CampusShame

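The internet still remembers! A list of companies that reneged on verbal offers, letters of intent, or tripartite agreements during campus recruiting. Though our voices carry little weight, we want to do our small part!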

3,334 160 Updated Oct 20, 2024

The-Run-Philosophy-Organization / run

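The official designated GitHub of Run Philosophy (润学) worldwide: it collects the aims, program, theory, and real-world examples of "running" (润, emigrating); answers the three questions of why to run, where to run, and how to run; and aspires to become the core religion and core belief of the new Chinese people.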

32,045 2,615 Updated Jul 31, 2024

OpenNMT / CTranslate2

Fast inference engine for Transformer models

C++ 3,812 359 Updated Apr 8, 2025

MediaBrain-SJTU / MemoNet

[CVPR2022] Remember Intentions: Retrospective-Memory-based Trajectory Prediction

Python 129 16 Updated Sep 11, 2022

MediaBrain-SJTU / GroupNet

[CVPR22] GroupNet: Multiscale Hypergraph Neural Networks for Trajectory Prediction with Relational Reasoning

Python 122 26 Updated Feb 11, 2023

Thinklab-SJTU / pygmtools

A Python Graph Matching Toolkit.

Python 332 19 Updated Oct 21, 2024

NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT

C++ 6,165 905 Updated Mar 27, 2024

alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…

C++ 11,040 1,861 Updated May 16, 2025

google / gemmlowp

Low-precision matrix multiplication

C++ 1,803 458 Updated Jan 29, 2024

tpoisonooo / chgemm

symmetric int8 gemm

Assembly 66 12 Updated Jun 7, 2020
