
An extremely fast Python package and project manager, written in Rust.

Rust 37,412 1,029 Updated Jan 29, 2025

Fast file synchronization and network forwarding for remote development

Go 3,490 157 Updated Oct 24, 2024

Git alias commands for faster, easier version control

Shell 2,491 335 Updated Sep 21, 2024

veRL: Volcano Engine Reinforcement Learning for LLM

Python 1,208 89 Updated Jan 29, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,353 472 Updated Jan 27, 2025

A high-performance deep learning training platform with task-level time-sharing scheduling of GPU compute

Python 389 52 Updated Oct 24, 2023

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 36,416 4,215 Updated Jan 29, 2025

LLM training in simple, raw C/CUDA

Cuda 25,173 2,886 Updated Oct 2, 2024

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 2,159 229 Updated Jan 27, 2025

Sample codes for my CUDA programming book

Cuda 1,626 333 Updated Jul 27, 2023

Herald: Accelerating Neural Recommendation Training with Embedding Scheduling (NSDI 2024)

Python 20 2 Updated May 9, 2024

A Lightweight Recommendation System

Python 8,463 646 Updated Nov 8, 2023

[PyTorch] Easy-to-use, modular, and extensible package of deep-learning-based CTR models.

Python 3,091 714 Updated Jul 2, 2024

Fast and memory-efficient exact attention

Python 15,221 1,439 Updated Jan 18, 2025

Make JSON greppable!

Go 13,967 329 Updated Nov 29, 2024

Tile primitives for speedy kernels

Cuda 1,969 102 Updated Jan 29, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 38,701 5,083 Updated Jan 29, 2025

[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)

Jupyter Notebook 2,181 274 Updated Jan 16, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,445 145 Updated Jan 24, 2025

Efficient Triton Kernels for LLM Training

Python 4,256 248 Updated Jan 29, 2025

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Rust 31,609 2,060 Updated Jan 29, 2025

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C++ 2,950 645 Updated Jan 29, 2025

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,412 523 Updated Jan 29, 2025

PyTorch package for the discrete VAE used for DALL·E.

Python 10,824 1,936 Updated Jan 31, 2024

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

27,635 3,721 Updated Jul 18, 2024

A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...

Rust 27,786 2,546 Updated Jan 28, 2025

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,473 307 Updated Oct 19, 2024

A Cloud Native Batch System (Project under CNCF)

Go 4,376 1,008 Updated Jan 24, 2025

Open source platform for the machine learning lifecycle

Python 19,336 4,326 Updated Jan 29, 2025

An implementation of a deep learning recommendation model (DLRM)

Python 3,820 848 Updated Oct 11, 2024