zhaohb

Starred repositories

Add genai backend for ollama to run generative AI models using OpenVINO Runtime.

Go 20 Updated Apr 10, 2025

Mysteries about hardware in a software engineer's eyes

C++ 5 3 Updated Apr 22, 2025

This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.

TypeScript 5,441 627 Updated Mar 17, 2025

MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

Python 19,289 1,395 Updated Mar 3, 2025

Using OpenVINO to speed up MeloTTS inference

Python 10 3 Updated Nov 1, 2024
Python 3 Updated Oct 16, 2024

Using OpenVINO to speed up moondream2 inference

Python 4 Updated Oct 14, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 2 1 Updated Aug 6, 2024

🤗 Optimum Intel: Accelerate inference with Intel optimization tools

Jupyter Notebook 460 129 Updated Apr 23, 2025
C++ 7 2 Updated Sep 22, 2023
Python 86 9 Updated Feb 6, 2024
Python 608 56 Updated Jul 31, 2024
Python 90 10 Updated Jun 30, 2023

🤱🏻 Turn any webpage into a desktop app with Rust. 🤱🏻 Easily build lightweight cross-platform desktop apps with Rust.

Rust 37,275 6,743 Updated Mar 25, 2025

LLM deployment project based on MNN. This project has been merged into MNN.

C++ 1,574 172 Updated Jan 20, 2025

Triton backend that enables pre-processing, post-processing and other logic to be implemented in Python.

C++ 604 164 Updated Apr 14, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 11,497 2,182 Updated Mar 11, 2025
Python 192 57 Updated Mar 28, 2023

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

C++ 861 164 Updated Dec 30, 2024

OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

C++ 8,171 2,580 Updated Apr 23, 2025

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

Go 6,188 717 Updated Apr 22, 2025

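AutoKernel is an easy-to-use, low-barrier automatic operator optimization tool that improves the deployment efficiency of deep learning algorithms.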

C++ 737 94 Updated Sep 23, 2022

The Triton backend for TensorRT.

C++ 73 32 Updated Apr 23, 2025

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

C++ 1 Updated Oct 4, 2021

Build Container Images In Kubernetes

Go 15,438 1,461 Updated Apr 21, 2025

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 9,109 1,561 Updated Apr 23, 2025

OpenVINO backend for Triton.

C++ 31 17 Updated Apr 16, 2025

Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

C++ 845 340 Updated Apr 23, 2025