Starred repositories
A lightweight library for portable low-level GPU computation using WebGPU.
Ongoing research training transformer models at scale
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
CV-CUDA™ is an open-source, GPU-accelerated library for cloud-scale image processing and computer vision.
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…
A native gRPC client & server implementation with async/await support.
Large Language Model Text Generation Inference
Bring projects, wikis, and teams together with AI. AppFlowy is an AI collaborative workspace where you achieve more without losing control of your data. The best open source alternative to Notion.
A high-throughput and memory-efficient inference and serving engine for LLMs
Mirror of the official PostgreSQL Git repository. Note that this is just a *mirror*; we don't work with pull requests on GitHub. To contribute, please see https://wiki.postgresql.org/wiki/Submitti…
LlamaIndex is a data framework for your LLM applications
An industrial-grade C++ implementation of the Raft consensus algorithm based on brpc, widely used inside Baidu to build highly available distributed systems.
Distributed vector search for AI-native applications
Making large AI models cheaper, faster and more accessible
Create extremely fast and secure embedded HTTP servers with ease.
High-Resolution Image Synthesis with Latent Diffusion Models
High-Resolution Image Synthesis with Latent Diffusion Models
Stable Diffusion and Flux in pure C/C++
Example models using DeepSpeed
Minimalistic large language model 3D-parallelism training
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
A framework for large scale recommendation algorithms.
NVIDIA Linux open GPU kernel module source
Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
A library for efficient similarity search and clustering of dense vectors.
🦜🔗 Build context-aware reasoning applications