University of Science and Technology of China
- Hefei, Anhui
Starred repositories
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
Awesome-LLM: a curated list of Large Language Models
[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
Large Language Model (LLM) Systems Paper List
A throughput-oriented high-performance serving framework for LLMs
📰 Must-read papers and blogs on Speculative Decoding ⚡️
SGLang is a fast serving framework for large language models and vision language models.
A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads
A high-throughput and memory-efficient inference and serving engine for LLMs
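A good project for campus, autumn, and spring recruitment and for internships! Walks you through implementing a high-performance deep learning inference library from scratch, step by step, with support for inference on models such as llama2, Unet, Yolov5, and Resnet.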
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
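[Go from Beginner to Practice] Study notes for learning Go and the Gin framework from scratch. The basic syntax part includes 26 demos; the Gin part covers custom route configuration, logging with Logrus, data binding and validation, custom error handling, Go gRPC Hello World... Continuously updated...
An IM instant-messaging system built with gin + websocket + mongodb: real-time chat over WS connections, supporting one-on-one chat, online replies, and chat history queries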
Package gorilla/websocket is a fast, well-tested and widely used WebSocket implementation for Go.
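A record of Go interview experiences from internships and autumn recruitment at large companies, including ByteDance, Tencent, DiDi, Baidu, and others... Covers Go basic syntax and internals, data structures and algorithms, operating systems, databases, computer networks, computer organization, and more...
📚 A summary of fundamental knowledge for C/C++ technical interviews, covering languages, libraries, data structures, algorithms, systems, networking, and linking and loading libraries, plus interview experiences, job postings, and referrals. This repository is a summary of the basic knowledge of recruiting job seekers and beginners in the direction of C/C++ technology, in…
Many images are hosted overseas, gcr for example, and downloading them from within China is slow and needs acceleration. Dedicated to providing a stable, reliable, and secure container image service that connects the whole world.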
The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems