View yifuwang's full-sized avatar

DeepEP: an efficient expert-parallel communication library

Cuda 7,280 669 Updated Mar 18, 2025

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,215 653 Updated Mar 19, 2025

NanoGPT (124M) in 3 minutes

Python 2,414 268 Updated Mar 18, 2025

Custom kernels in Triton language for accelerating LLMs

Python 18 Updated Apr 5, 2024

Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.

Python 1,313 90 Updated Mar 22, 2025

Optimized primitives for collective multi-GPU communication

C++ 3,585 884 Updated Mar 16, 2025

CUDA Templates for Linear Algebra Subroutines

C++ 7,156 1,174 Updated Mar 21, 2025

Inference Llama 2 with a model compiled to native code by TorchInductor

C++ 14 4 Updated Feb 8, 2024

Inference Llama 2 in one file of pure C

C 18,204 2,218 Updated Aug 6, 2024

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Python 5,893 545 Updated Mar 13, 2025

Development repository for the Triton language and compiler

MLIR 14,954 1,880 Updated Mar 23, 2025

Distribute and run AI workloads magically in Python, like PyTorch for ML infra.

Python 1,014 38 Updated Mar 20, 2025

A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to facilitate metric computation in distributed training and tools…

Python 229 54 Updated Jan 17, 2025
Jupyter Notebook 162 27 Updated Jun 16, 2024

A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind.

Python 155 44 Updated Dec 6, 2024

Use Neovim as a language server to inject LSP diagnostics, code actions, and more via Lua.

Lua 3,637 777 Updated Oct 3, 2023

PyTorch domain library for recommendation systems

Python 2,064 488 Updated Mar 23, 2025

🌸 A command-line fuzzy finder

Go 68,806 2,481 Updated Mar 23, 2025

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.

Python 29,170 3,458 Updated Mar 21, 2025

❤️ Slim, Fast and Hackable Completion Framework for Neovim

Python 1,336 43 Updated Mar 17, 2022

PyTorch elastic training

Python 730 100 Updated Jun 15, 2022

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Python 14,430 2,254 Updated Feb 1, 2025

A toolkit for developing and comparing reinforcement learning algorithms.

Python 35,650 8,654 Updated Oct 11, 2024

Retro Games in Gym

C 3,470 532 Updated Feb 22, 2024

🌺 Minimalist Vim Plugin Manager

Vim Script 34,663 1,950 Updated Mar 12, 2025

🌠 Dark powered asynchronous completion framework for neovim/Vim8

Python 5,957 293 Updated Jun 5, 2024

This repository is outdated and new Boost Note app is available! We've launched a new Boost Note app which supports real-time collaborative writing. https://github.com/BoostIO/BoostNote-App

JavaScript 17,042 1,460 Updated Apr 19, 2023

C/C++ language server supporting multi-million line code base, powered by libclang. Emacs, Vim, VSCode, and others with language server protocol support. Cross references, completion, diagnostics, …

C++ 2,348 163 Updated Jul 29, 2020