Applied AI experiments and examples for PyTorch

Python 193 18 Updated Dec 17, 2024

A throughput-oriented high-performance serving framework for LLMs

Cuda 672 27 Updated Sep 21, 2024

GitHub mirror of the triton-lang/triton repo.

C++ 15 4 Updated Dec 23, 2024

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 56,416 8,343 Updated Dec 29, 2024

Mirage: Automatically Generating Fast GPU Kernels without Programming in Triton/CUDA

C++ 690 40 Updated Dec 26, 2024

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,711 217 Updated Dec 28, 2024

SGLang is a fast serving framework for large language models and vision language models.

Python 6,843 622 Updated Dec 29, 2024

Master programming by recreating your favorite technologies from scratch.

Markdown 319,491 29,633 Updated Sep 3, 2024

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

Python 34,858 2,639 Updated Dec 29, 2024

Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

Cuda 760 41 Updated Dec 28, 2024

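Many container images are hosted overseas, for example on gcr. Downloads from within China are slow and need acceleration. Dedicated to providing a stable, reliable, and secure container image service that connects the whole world.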

Shell 7,516 984 Updated Dec 28, 2024

Regular expression manipulation library

Python 346 41 Updated Jun 8, 2024

Checks regexes for overlaps. Based on greenery by @qntm.

Python 43 6 Updated Jun 5, 2024

An LR(1) parser generator and visualizer created for educational purposes.

Rust 93 5 Updated Nov 19, 2024

Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.

Python 5,001 420 Updated Nov 13, 2024

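🔥🔥 More than 1,000 classic computer science books, personal notes, and resources referenced in the author's articles published on various platforms. The book collection covers C/C++, Java, Python, Go, data structures and algorithms, operating systems, backend architecture, computer systems, databases, computer networks, design patterns, frontend, assembly, and interview write-ups from campus and experienced-hire recruiting.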

21,610 3,792 Updated Jul 2, 2024

Data validation using Python type hints

Python 21,810 1,944 Updated Dec 28, 2024

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batchsizes of 16-32 tokens.

Python 663 52 Updated Sep 4, 2024

Structured Text Generation

Python 10,171 531 Updated Dec 28, 2024

Python bindings for general-sam and some utilities

Python 3 Updated Dec 9, 2024

A general suffix automaton implementation in Rust with Python bindings

Rust 4 Updated Oct 18, 2024

The easiest and laziest way to build multi-agent LLM applications.

Python 1,050 69 Updated Dec 27, 2024

🌟 Wiki of OI / ICPC for everyone. (An online strategy guide for a certain large-scale game, featuring dazzling arithmetic magic)

TypeScript 21,777 4,035 Updated Dec 28, 2024

Chat with your notes & see links to related content with AI embeddings. Use local models or 100+ via APIs like Claude, Gemini, ChatGPT & Llama 3

JavaScript 2,967 188 Updated Dec 27, 2024

[NeurIPS 2024 Spotlight] Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models

Python 565 54 Updated Oct 28, 2024
Python 1 1 Updated Apr 17, 2024

lightllm with pp optimization added

Python 1 Updated Feb 26, 2024

Build Conversational AI in minutes ⚡️

Python 7,513 1,003 Updated Dec 29, 2024

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Python 370 42 Updated Dec 23, 2024