Stars
An extremely fast Python package and project manager, written in Rust.
Fast file synchronization and network forwarding for remote development
Git alias commands for faster, easier version control
veRL: Volcano Engine Reinforcement Learning for LLM
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
Sample code for my CUDA programming book
Herald: Accelerating Neural Recommendation Training with Embedding Scheduling (NSDI 2024)
[PyTorch] Easy-to-use, modular, and extendible package of deep-learning-based CTR models.
Fast and memory-efficient exact attention
Tile primitives for speedy kernels
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Efficient Triton Kernels for LLM Training
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
A retargetable MLIR-based machine learning compiler and runtime toolkit.
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
PyTorch package for the discrete VAE used for DALL·E.
📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.
A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...
A list of awesome compiler projects and papers for tensor computation and deep learning.
A Cloud Native Batch System (Project under CNCF)
Open source platform for the machine learning lifecycle
An implementation of a deep learning recommendation model (DLRM)