An attempt at deploying a pre-trained model to Kubernetes for inference. Note that this requires downloading model weights, which range from ~500MB to ~100GB.
Local run:
```shell
# Set up virtualenv and install dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Load environment variables and run
source .env
python3 app.py
```
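For orientation, a minimal sketch of what an `app.py` like this might contain: load a Hugging Face pipeline for the configured model and expose a generate function. The `HF_MODEL` variable, function names, and lazy-loading scheme are assumptions for illustration, not taken from the actual `app.py`.

```python
# Hypothetical sketch of app.py internals (names are assumptions).
import os

MODEL_ID = os.environ.get("HF_MODEL", "openai-community/gpt2")
_pipe = None

def get_pipeline():
    # Lazy-load so the (possibly multi-GB) weight download happens
    # on first use rather than at import time.
    global _pipe
    if _pipe is None:
        from transformers import pipeline
        _pipe = pipeline("text-generation", model=MODEL_ID)
    return _pipe

def generate(prompt, max_new_tokens=50):
    # Returns the prompt plus generated continuation as plain text.
    return get_pipeline()(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"]
```

Lazy loading keeps startup fast and lets the same image serve different models by changing one environment variable.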
Kubernetes run:
```shell
# Build a multi-arch image; --push publishes it to the registry,
# so a separate `docker push` is not needed
docker buildx build --platform linux/amd64,linux/arm64 -t thomasvn/python-inference . --push

# Substitute environment variables into the manifest and apply it
source .env
envsubst < k8s.yaml | kubectl apply -f -
```
Models:
- `openai-community/gpt2`: ~525MB, 137M params.
- `mistralai/Mixtral-8x7B-Instruct-v0.1`: ~87GB, 46.7B params.
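As a rough sanity check, the listed sizes line up with the parameter counts if one assumes the GPT-2 checkpoint ships fp32 weights (4 bytes/param) and Mixtral ships bf16 (2 bytes/param) — an assumption about the published checkpoints, not a stated fact:

```python
# Estimate on-disk weight size from parameter count and precision.
def weight_bytes(n_params, bytes_per_param):
    return n_params * bytes_per_param

gpt2_mib = weight_bytes(137e6, 4) / 2**20      # fp32: 4 bytes/param
mixtral_gib = weight_bytes(46.7e9, 2) / 2**30  # bf16: 2 bytes/param
print(f"gpt2: {gpt2_mib:.0f} MiB, Mixtral: {mixtral_gib:.0f} GiB")
# prints: gpt2: 523 MiB, Mixtral: 87 GiB
```

This kind of back-of-envelope math is useful for sizing node disks and memory before pulling a model.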