A high-throughput and memory-efficient inference and serving engine for LLMs

Python 42,671 6,469 Updated Mar 26, 2025

Enjoy the magic of Diffusion models!

Python 8,098 726 Updated Mar 25, 2025

No fortress, purely open ground. OpenManus is Coming.

Python 39,821 6,611 Updated Mar 25, 2025

Building DeepSeek R1 from Scratch

Jupyter Notebook 495 72 Updated Mar 21, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,663 282 Updated Mar 10, 2025

Muon optimizer: +>30% sample efficiency with <3% wallclock overhead

Python 523 28 Updated Mar 25, 2025

Fully open reproduction of DeepSeek-R1

Python 23,306 2,120 Updated Mar 24, 2025

Solve Visual Understanding with Reinforced VLMs

Python 4,336 266 Updated Mar 24, 2025

PyTorch distributed tutorials

Jupyter Notebook 117 25 Updated Feb 23, 2025

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 13,464 2,751 Updated Mar 26, 2025

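This project aims to share the technical principles behind large models along with hands-on experience (LLM engineering and deploying LLM applications)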

HTML 15,705 1,828 Updated Mar 2, 2025

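AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks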

Jupyter Notebook 13,000 1,871 Updated Mar 1, 2025

Best practice for training LLaMA models in Megatron-LM

Python 645 56 Updated Jan 2, 2024

Stable-Hair: Real-World Hair Transfer via Diffusion Model (AAAI 2025)

Python 443 40 Updated Mar 14, 2025

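《代码随想录》 LeetCode practice guide: a recommended order for 200 classic problems, 600,000 words of detailed illustrated explanations, video walkthroughs of the hard parts, more than 50 mind maps, and solutions in C++, Java, Python, Go, JavaScript, and other languages. No more feeling lost when studying algorithms! 🔥🔥 Take a look; you'll wish you had found it sooner! 🚀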

Shell 55,035 11,916 Updated Mar 17, 2025

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

JavaScript 4,141 404 Updated Mar 21, 2025

Transformer seq2seq model, program that can build a language translator from parallel corpus

Python 1,385 349 Updated May 19, 2023

Transformer: PyTorch Implementation of "Attention Is All You Need"

Python 3,516 502 Updated Aug 6, 2024

The hardware design for AgiBot X1.

888 277 Updated Mar 14, 2025

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 7,589 831 Updated Mar 18, 2025

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.

Python 955 141 Updated Mar 25, 2025

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 37,617 4,315 Updated Mar 25, 2025

Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Jupyter Notebook 4,011 334 Updated Jan 13, 2025

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,769 512 Updated Mar 17, 2025

Ongoing research training transformer models at scale

Python 11,883 2,664 Updated Mar 25, 2025

Inference code for Llama models

Python 57,933 9,717 Updated Jan 26, 2025

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 6,404 735 Updated Oct 22, 2024
Python 17 2 Updated Jan 17, 2025
Python 52 2 Updated Aug 27, 2024