Starred repositories
Add a GenAI backend for Ollama to run generative AI models using the OpenVINO Runtime.
This is a simple demonstration of more advanced, agentic patterns built on top of the Realtime API.
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
Using OpenVINO to speed up moondream2 inference
zhaohb / TTS-OV
Forked from coqui-ai/TTS. 🐸💬 A deep learning toolkit for Text-to-Speech, battle-tested in research and production.
🤗 Optimum Intel: Accelerate inference with Intel optimization tools
🤱🏻 Turn any webpage into a desktop app with Rust. Easily build lightweight, multi-platform desktop apps with Rust.
LLM deployment project based on MNN. This project has been merged into MNN.
Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.
OpenVINO™ is an open source toolkit for optimizing and deploying AI inference.
Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.
The Triton backend for TensorRT.
zhaohb / jetson-inference
Forked from dusty-nv/jetson-inference. Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
Build Container Images In Kubernetes
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
OpenVINO backend for Triton.
Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure