Stars
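AIFoundation mainly covers what happens when AI systems meet large models: the full-stack core technologies for supporting large-model training and inference at the system level, from the bottom layer to the top.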
This is the top-level repository for the Accel-Sim framework.
A bunch of coding tutorials for my YouTube videos on Neural Network Quantization.
Course materials for MIT6.5940: TinyML and Efficient Deep Learning Computing
Effective Modern C++ - translation complete
Fast SpMM implementation on GPUs for GNN (IPDPS'23)
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Code for ACM MobiCom 2024 paper "FlexNN: Efficient and Adaptive DNN Inference on Memory-Constrained Edge Devices"
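A Chinese-language teaching guide to C++ templates. Unlike the well-known book C++ Templates, this tutorial series teaches C++ templates as a Turing-complete language, aiming to help readers gain a thorough mastery of metaprogramming. (Under construction)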
heterogeneity-aware-lowering-and-optimization
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
This repository contains tutorials and examples for Triton Inference Server
My learning notes and code for ML systems.
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
PKU OS course project and notes based on Nachos and XV6