
A lightweight data processing framework built on DuckDB and 3FS.

Python 4,068 339 Updated Mar 5, 2025

Expert Parallelism Load Balancer

Python 1,050 152 Updated Feb 27, 2025

Analyze computation-communication overlap in V3/R1.

906 115 Updated Mar 3, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,557 251 Updated Mar 10, 2025

My learning notes/codes for ML SYS.

Python 1,353 69 Updated Mar 10, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 4,880 479 Updated Mar 10, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,108 614 Updated Mar 11, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,238 785 Updated Mar 1, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,716 198 Updated Mar 4, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 11,681 1,192 Updated Mar 11, 2025

🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.

Python 3,117 159 Updated Mar 7, 2025

A collection of community maintained NRI plugins

Go 74 25 Updated Mar 6, 2025

OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go

Go 1 Updated Dec 3, 2024

SciLifeLab Serve is a platform offering machine learning model serving, data science app hosting (Shiny, Gradio, Streamlit, Dash, etc.), and other tools to life science researchers affiliated with …

JavaScript 8 1 Updated Mar 10, 2025

Examples of models deployable with Truss

Python 164 40 Updated Mar 7, 2025

BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them to BERT, intermediate results are pooled. The implementation …

Python 135 31 Updated Jun 19, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 41,006 6,180 Updated Mar 11, 2025

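Llama Chinese community: the Llama3 online demo and fine-tuned models are now available, with the latest Llama3 learning materials aggregated in real time. All code has been updated for Llama3. Building the best Chinese Llama large model, fully open source and commercially usable.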

Python 14,467 1,293 Updated Sep 5, 2024

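Free ChatGPT API Key. A free ChatGPT API with GPT4 API support (free); a free forwarding API for ChatGPT that is usable within mainland China, with direct connection and no proxy required. Works with software and plugins such as ChatBox, greatly reducing API usage costs. Chat without restrictions from within China.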

Python 28,108 2,060 Updated Feb 14, 2025

Slim(toolkit): Don't change anything in your container image and minify it by up to 30x (and for compiled languages even more) making it secure too! (free and open source)

Go 21,210 766 Updated Mar 10, 2025

A one-stop, open-source, high-quality data extraction tool for converting PDF to Markdown and JSON.

Python 27,865 2,144 Updated Mar 7, 2025

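This project aims to collect open-source datasets for table intelligence tasks (such as table question answering and table-to-text generation), convert the raw data into instruction-tuning format, and fine-tune LLMs on it to strengthen their understanding of tabular data, ultimately building a large language model specialized for table intelligence tasks.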

540 41 Updated Apr 22, 2024

GPU-Jupyter: Your GPU-accelerated JupyterLab with a rich data science toolstack, TensorFlow and PyTorch for your reproducible deep learning experiments.

Jupyter Notebook 724 234 Updated Feb 28, 2025

This repository contains a Python implementation that allows you to use gorilla-llm/gorilla-openfunctions-v2 LLM to perform function calling using the OpenAI protocol. It provides a way to extend t…

Python 16 3 Updated Apr 7, 2024

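SparkClient is a Go library for interacting with the chat API of iFlytek Spark AI. It encapsulates the logic for creating requests, handling responses, and WebSocket communication, making it simple to integrate Spark AI services into Go applications.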

Go 2 Updated Feb 17, 2025

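OpenAI API adapter supporting the Qianfan large model platform, iFlytek Spark, Tencent Hunyuan, MiniMax, Deep-Seek, and other OpenAI-compatible APIs. A single executable with very simple configuration, one-click deployment, and ready to use out of the box. Seamlessly integrate with OpenAI and compatible APIs using a single executable for quick setup and depl…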

Go 2,079 163 Updated Feb 19, 2025

YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]

Python 10,466 1,047 Updated Sep 26, 2024

The first AI Agent Server, Eidolon is a pluggable Agent SDK and enterprise ready, deployment server for Agentic applications

Python 418 39 Updated Dec 19, 2024
Go 6 3 Updated May 9, 2024

Chinese Llama large language model project, phase three (Chinese Llama-3 LLMs), developed from Meta Llama 3

Python 1,887 165 Updated Sep 23, 2024