Repositories
- onnxruntime-genai Public Forked from microsoft/onnxruntime-genai
Generative AI extensions for onnxruntime
- silero-vad Public Forked from snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
- onnxruntime Public Forked from microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
- TensorRT-LLM Public Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
- spiritlm Public Forked from facebookresearch/spiritlm
Inference code for the paper "Spirit-LM Interleaved Spoken and Written Language Model".
- triton-client Public Forked from triton-inference-server/client
Triton Python, C++ and Java client libraries, and GRPC-generated client examples for go, java and scala.
- triton-server Public Forked from triton-inference-server/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.