Stars
SGLang is a fast serving framework for large language models and vision language models.
An extension library for TensorFlow to accelerate industrial recommendation-system model training
An industrial extension library for PyTorch to accelerate large-scale model training
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
DLRover: An Automatic Distributed Deep Learning System
Ring attention implementation with flash attention
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
This repository contains the results and code for the MLPerf™ Training v3.1 benchmark.
Robust recipes to align language models with human and AI preferences
High-speed download of LLaMA, Facebook's 65B-parameter large language model
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
ChatGLM2-6B: An Open Bilingual Chat LLM | Open-source bilingual dialogue language model
A high-accuracy, high-efficiency multi-task fine-tuning framework for Code LLMs. Accepted at KDD 2024.
Must-read papers on prompt-based tuning for pre-trained language models.
Ongoing research training transformer models at scale
A repo for distributed training of language models with Reinforcement Learning from Human Feedback (RLHF)
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples
Kubernetes-native Deep Learning Framework
Pretrain and fine-tune AI models of any size on multiple GPUs or TPUs with zero code changes.
PyTorch extensions for high-performance and large-scale training.
Run your deep learning workloads on Kubernetes more easily and efficiently.
Unified Interface for Constructing and Managing Workflows on different workflow engines, such as Argo Workflows, Tekton Pipelines, and Apache Airflow.