The easiest way to serve AI apps and models - build model inference APIs, job queues, LLM apps, multi-model pipelines, and more!
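As a rough illustration of what a minimal model inference API looks like, here is a sketch using FastAPI and a Hugging Face `pipeline`; the route name, request schema, and default classifier model are illustrative assumptions, not the API of any repo listed here.

```python
# A minimal sketch of a model inference API (assumed stack: FastAPI +
# transformers; the /predict route and payload shape are made up here).
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
classifier = pipeline("text-classification")  # loads the library's default model

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(req: PredictRequest):
    # Run the model on the incoming text and return its label and score.
    return classifier(req.text)[0]
```

Serve it with `uvicorn main:app` and POST JSON like `{"text": "great product"}` to `/predict`.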
CLIP as a service - embed images and sentences for object recognition, visual reasoning, image classification, and reverse image search
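For intuition, here is a sketch of CLIP-style joint image/text embedding using the open CLIP weights via the `transformers` library (not that repo's own client API); the image file is a placeholder.

```python
# A hedged sketch of CLIP embedding and zero-shot matching, assuming the
# openai/clip-vit-base-patch32 checkpoint from Hugging Face; "cat.jpg"
# is a placeholder local image.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")
inputs = processor(text=["a photo of a cat", "a photo of a dog"],
                   images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)

# Similarity logits between the image and each sentence; comparing such
# embeddings is the basis for classification and reverse image search.
probs = outputs.logits_per_image.softmax(dim=1)
print(probs)
```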
Online inference API for NLP Transformer models - summarization, text classification, sentiment analysis, and more
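The tasks such an API exposes can be tried locally with the `transformers` pipeline API; this small example uses the library's default models for each task, chosen here only for illustration.

```python
# Local illustration of the task types behind an NLP inference API,
# using transformers' default summarization and sentiment models.
from transformers import pipeline

summarizer = pipeline("summarization")
sentiment = pipeline("sentiment-analysis")

text = ("Large Language Models are increasingly served behind HTTP APIs "
        "so that applications can call them without hosting the weights.")
print(summarizer(text, max_length=30, min_length=5)[0]["summary_text"])
print(sentiment(text)[0])  # e.g. {'label': 'POSITIVE', 'score': ...}
```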
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low-Rank Adapters (LoRA), and gain hands-on experience with Predibase's LoRAX inference server.
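KV caching speeds up decoding by reusing each token's attention keys and values instead of recomputing them every step; LoRA shrinks fine-tuning by training a low-rank delta on top of frozen weights. The toy NumPy sketch below shows the LoRA idea only; the matrix sizes and data are made up, and the `alpha / r` scaling follows the original LoRA formulation.

```python
# Toy sketch of LoRA: instead of updating the full weight W (d x d),
# train a rank-r delta B @ A with r << d. All sizes are illustrative.
import numpy as np

d, r, alpha = 1024, 8, 16
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable rank-r "down" projection
B = np.zeros((d, r))                 # trainable "up" projection, starts at zero

def lora_forward(x):
    # Base output plus the scaled low-rank correction (alpha/r) * B @ A.
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.normal(size=(1, d))
print(lora_forward(x).shape)  # (1, 1024)
# Trainable params: 2*d*r = 16,384 vs d*d = 1,048,576 for full fine-tuning.
```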