
Starred repositories
- verl: Volcano Engine Reinforcement Learning for LLMs
- A Zotero plugin for syncing items and notes into Notion
- A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deploym…
- A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
- Analyze computation-communication overlap in V3/R1.
- A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
- A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
- Ongoing research training transformer models at scale
- DeepEP: an efficient expert-parallel communication library
- Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
- Fully open reproduction of DeepSeek-R1
- The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models".
- A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
- 🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
- Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization (a minimal sketch of the merge loop follows this list).
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
- Open-Sora: Democratizing Efficient Video Production for All
- [NeurIPS 2024 Best Paper] [GPT beats diffusion 🔥] [scaling laws in visual generation 📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
- Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".
- Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
- Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
- An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
- The Open Cookbook for Top-Tier Code Large Language Models
- A high-throughput and memory-efficient inference and serving engine for LLMs
- A collection of phenomena observed during the scaling of big foundation models, which may develop into consensus, principles, or laws in the future
- Secrets of RLHF in Large Language Models Part I: PPO
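
The BPE entry above refers to the byte-pair-encoding training loop used in LLM tokenizers. As a rough illustration only (not code from that repository), a minimal sketch of the merge step might look like this:

```python
from collections import Counter

def most_common_pair(ids):
    """Return the most frequent adjacent pair of token ids."""
    return Counter(zip(ids, ids[1:])).most_common(1)[0][0]

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

# Illustrative training loop: start from raw UTF-8 bytes, repeatedly merge
# the most frequent pair, and assign each merge the next free token id.
text = "low lower lowest"          # toy corpus, purely for demonstration
ids = list(text.encode("utf-8"))
merges = {}
for new_id in range(256, 256 + 5):  # 5 merges; real vocabularies use thousands
    pair = most_common_pair(ids)
    ids = merge(ids, pair, new_id)
    merges[pair] = new_id
print(merges)
```

Encoding new text then amounts to replaying the recorded merges in order; the sketch omits that step and any handling of special tokens.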