- Beijing
- https://lucky521.github.io
Starred repositories
A list of awesome papers and resources on recommender systems with large language models (LLMs).
Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Large Language Model Text Generation Inference
A retargetable MLIR-based machine learning compiler and runtime toolkit.
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
🦜🔗 Build context-aware reasoning applications
Robust Speech Recognition via Large-Scale Weak Supervision
Fast and memory-efficient exact attention
ChatGLM-6B: An Open Bilingual Dialogue Language Model
C++ implementation of ChatGLM-6B & ChatGLM2-6B & ChatGLM3 & GLM4(V)
A high-throughput and memory-efficient inference and serving engine for LLMs
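A pure C++ cross-platform LLM acceleration library, callable from Python; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single card; supports GLM, LLaMA, and MOSS base models and runs smoothly on mobile devices.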
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Port of OpenAI's Whisper model in C/C++
Source code for Twitter's Recommendation Algorithm
Source code for Twitter's Recommendation Algorithm
The core of our monitoring platform with a powerful configuration language and REST API.
Netty project - an event-driven asynchronous network application framework
Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
Feathr – A scalable, unified data and AI engineering platform for enterprise
OpenMLDB is an open-source machine learning database that provides a feature platform computing consistent features for training and inference.
TiSpark is built for running Apache Spark on top of TiDB/TiKV
Deep learning with dynamic computation graphs in TensorFlow