```sh
virtualenv -p python3.8 bentoml_env
source bentoml_env/bin/activate
pip install -r requirements.txt
python download_models.py
chmod 776 yolov8n-seg.pt
chmod 776 yolov8n.pt
BENTOML_CONFIG=configuration.yml bentoml serve service.py:svc -p 8995 --development --reload --debug
```
- How can GPU 1 be allocated when serving this way?
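  One hedged way to do it, assuming a BentoML 1.x `configuration.yml`: runner resources can request GPUs, and newer releases accept a list of device IDs to pin specific cards. The exact keys depend on the BentoML version, so verify this sketch against its config schema:

  ```yaml
  # configuration.yml -- sketch; confirm the keys for your BentoML version
  runners:
    resources:
      nvidia.com/gpu: [1]  # request GPU device 1 for the runners
  ```

  A cruder alternative that sidesteps BentoML's scheduler entirely: prefix the serve command with `CUDA_VISIBLE_DEVICES=1`, so the whole process only sees device 1.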
```sh
bentoml build
bentoml containerize ml_pipeline_service:latest
docker run --rm --gpus '"device=1"' -v ./configuration.yml:/home/bentoml/configuration.yml -v $(pwd):/home/bentoml/bento/src/ -p 8995:3000 ml_pipeline_service:(your_tag)
```
- Memory issues arise when running inside the container: usage grows cumulatively by roughly 500 MB with each inference attempt.
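  To see whether the growth comes from the inference code itself or from the serving/container layer, a minimal logging sketch (assumes `psutil` is installed; `run_inference` is a hypothetical placeholder for the real call):

  ```python
  import gc
  import os

  import psutil

  proc = psutil.Process(os.getpid())


  def rss_mb():
      # Resident set size of the current process, in megabytes
      return proc.memory_info().rss / 1024 ** 2


  def run_inference():
      # Hypothetical placeholder: call the real model/runner here
      pass


  for i in range(10):
      before = rss_mb()
      run_inference()
      gc.collect()  # rule out garbage that simply hasn't been collected yet
      print(f"iteration {i}: {before:.1f} MB -> {rss_mb():.1f} MB")
  ```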
- A 30-item batch request fails by exceeding a 60-second time limit. (Can't reproduce in this repo because of different data and pre/post-processing.)
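  The 60-second figure matches BentoML's default request timeout, so raising it in `configuration.yml` is worth trying. The key names vary between releases (older builds use `api_server.timeout`, newer ones nest it under `traffic`), so treat this as a sketch to verify:

  ```yaml
  # configuration.yml -- sketch; confirm the exact keys for your BentoML version
  api_server:
    traffic:
      timeout: 300   # seconds allowed per API request (default is 60)
  runners:
    traffic:
      timeout: 300   # seconds allowed per runner call
  ```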
- Running with `BENTOML_CONFIG=configuration.yml bentoml serve service.py:svc -p 8995 --development --reload --debug` consumes significantly less memory than running inside a Docker container: roughly 3 GB for `bentoml serve` versus 6-8 GB for Docker, and the Docker figure escalates further as batch sizes grow.
- A RuntimeError appears with a batch size of 30: "Unexpected ASGI message 'http.response.start' sent after the response has already been completed." The request still completes and returns the desired response, yet the error persists. (Can't reproduce in this repo because of different data and pre/post-processing.)
- Swagger functionality is broken when using the Docker container, but it works seamlessly with `bentoml serve`; the reason for this disparity is still being investigated.
- What benefits does a custom runner built on the transformers library provide over `bentoml.transformers`?
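  Roughly: `bentoml.transformers` integrates with the model store (versioning, the model gets packed into the bento at build time) and gives a prebuilt runner, while a custom `bentoml.Runnable` trades that away for full control over loading, device placement, batching, and pre/post-processing. A minimal sketch of the custom side (task and checkpoint are illustrative):

  ```python
  import bentoml
  from transformers import pipeline


  class CustomTransformersRunnable(bentoml.Runnable):
      SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
      SUPPORTS_CPU_MULTI_THREADING = True

      def __init__(self):
          # Full control over loading: task, checkpoint, device, dtype, ...
          self.pipe = pipeline(
              "text-classification",
              model="distilbert-base-uncased-finetuned-sst-2-english",
          )

      @bentoml.Runnable.method(batchable=True, batch_dim=0)
      def predict(self, texts):
          # Adaptive batching hands this a list collected across requests
          return self.pipe(texts)


  runner = bentoml.Runner(CustomTransformersRunnable, name="custom_transformers")
  ```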
- A `defaultdict` can't be sent as input, but returning one as output works fine.
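  A plausible explanation, stated as an assumption: the response path only has to JSON-serialize the object, and `defaultdict` is a `dict` subclass so that works, while the input path deserializes and validates into plain types, so a `defaultdict` never survives the trip. A small sketch of the round-trip and a workaround:

  ```python
  import json
  from collections import defaultdict

  dd = defaultdict(list)
  dd["detections"].append([0, 0, 10, 10])

  # Serializing works: json.dumps accepts any dict subclass
  payload = json.dumps(dd)

  # After deserialization only a plain dict remains...
  received = json.loads(payload)
  assert type(received) is dict

  # ...so rebuild the defaultdict on the receiving side if the
  # default-factory behaviour is actually needed there
  restored = defaultdict(list, received)
  ```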