K8s Inferencing

(attempting to) Deploy a pre-trained model to k8s for inferencing. Note that this requires downloading models, which can range from 500MB to 100GB.

Usage

Local run:

# Setup virtualenv
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run
source .env
python3 app.py
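
app.py itself is not included in this README, so the following is only a minimal sketch of what such an inference script might look like. The library choices (Hugging Face transformers plus Flask), the model name, the /predict route, and port 8080 are all assumptions, not taken from the repo:

# Minimal sketch of an app.py (library, model, route, and port are assumptions)
from flask import Flask, request, jsonify
from transformers import pipeline

app = Flask(__name__)

# Downloading the pre-trained weights on first run is the 500MB-100GB hit mentioned above
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

@app.route("/predict", methods=["POST"])
def predict():
    text = request.json["text"]
    return jsonify(classifier(text))

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)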

Kubernetes run:

# Build a multi-arch image; --push uploads it to the registry directly
docker buildx build --platform linux/amd64,linux/arm64 -t thomasvn/python-inference . --push

# Substitute environment variables into the manifest and apply it
source .env
envsubst < k8s.yaml | kubectl apply -f -
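
Once the manifest is applied, the deployment can be smoke-tested from a local machine. This sketch assumes the pod is reachable via kubectl port-forward on port 8080 and serves the hypothetical /predict endpoint from the app.py sketch above; the deployment name python-inference is also an assumption:

# Hypothetical smoke test; run `kubectl port-forward deploy/python-inference 8080:8080` first
import requests

resp = requests.post(
    "http://localhost:8080/predict",
    json={"text": "K8s inferencing works end to end."},
)
print(resp.status_code, resp.json())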

Model Types
