  • University of Science and Technology of China
  • Hefei, Anhui

Highlights

  • Pro

Starred repositories

CUDA/Metal accelerated language model inference

C 490 21 Updated Dec 18, 2024

Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O

C++ 211 18 Updated Jan 15, 2025

Awesome-LLM: a curated list of Large Language Models

20,654 1,686 Updated Jan 13, 2025

[COLM 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding

Python 235 15 Updated Aug 31, 2024

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

Python 145 6 Updated Oct 30, 2024

A lightweight LLM inference framework

C++ 712 86 Updated Apr 7, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 9,161 1,070 Updated Jan 16, 2025

Large Language Model (LLM) Systems Paper List

732 26 Updated Jan 13, 2025

A throughput-oriented high-performance serving framework for LLMs

Cuda 695 29 Updated Sep 21, 2024

📰 Must-read papers and blogs on Speculative Decoding ⚡️

560 26 Updated Jan 16, 2025

Inference code for Llama models

Python 57,232 9,658 Updated Aug 18, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 7,371 709 Updated Jan 17, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 832 47 Updated Nov 14, 2024

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,374 137 Updated Jan 17, 2025

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,378 166 Updated Jun 25, 2024

LLM inference in C/C++

C++ 70,839 10,244 Updated Jan 17, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 33,901 5,198 Updated Jan 17, 2025

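A good project for campus recruiting (autumn and spring hiring) and internships! Implement a high-performance deep learning inference library from scratch, step by step, with support for inference on models such as llama2, Unet, Yolov5, and Resnet.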

C++ 2,667 302 Updated Oct 26, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 36,262 4,199 Updated Jan 16, 2025

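[Go from Beginner to Practice] study notes: learn Go and the Gin framework from scratch. The basic syntax part includes 26 demos; the Gin part covers custom route configuration, logging with Logrus, data binding and validation, custom error handling, Go gRPC Hello World... Continuously updated...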

Go 4,456 1,212 Updated Apr 7, 2024

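An IM instant chat system built with gin + websocket + mongodb: real-time chat over WS connections, supporting one-on-one chat, online replies, and chat history queries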

Go 129 39 Updated Jun 20, 2022

Package gorilla/websocket is a fast, well-tested and widely used WebSocket implementation for Go.

Go 22,936 3,511 Updated Aug 18, 2024

goim server written in Golang! 🚀

Go 2,797 469 Updated Mar 24, 2024

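A distributed IM instant messaging system implemented in pure Go; each layer can be deployed independently, and the layers communicate via RPC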

Go 589 141 Updated Nov 23, 2019

A Golang, websocket-based distributed chat (IM) system that supports a million connections on a single machine

Go 2,908 617 Updated Dec 16, 2024

IM Chat ChatGPT

Go 14,283 2,517 Updated Jan 17, 2025

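A record of Go interview experiences at large companies during internship and autumn recruiting, including ByteDance, Tencent, DiDi, Baidu, and others... Covers Go basic syntax and internals, data structures and algorithms, operating systems, databases, computer networks, computer organization, and more...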

64 3 Updated Oct 7, 2022

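📚 A summary of fundamental knowledge for C/C++ technical interviews, covering the language, libraries, data structures, algorithms, systems, networking, and linking, loading, and libraries, plus interview experiences, job postings, and referrals. This repository is a summary of the basic knowledge of recruiting job seekers and beginners in the direction of C/C++ technology, in…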

C++ 35,229 8,013 Updated Mar 19, 2024

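Many images are hosted overseas (gcr, for example) and download slowly from within China, so acceleration is needed. Dedicated to providing a stable, reliable, and secure container image service that connects the whole world.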

Shell 7,872 1,001 Updated Jan 17, 2025

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems

Go 69,016 18,676 Updated Jan 17, 2025