- CUDA-enabled GPU
- Python 3.8+
- Clone the repository:
  git clone https://github.com/WoSea/bloomz-1b1-api.git
  cd bloomz-1b1-api
- Build the Docker image:
  docker build -t bloomz-1b1-api .
- Run the Docker container:
  docker run --gpus all -p 8000:8000 bloomz-1b1-api
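- Interact with the API:
  - Generate text:
    curl -X POST "http://localhost:8000/llm" -H "Content-Type: application/json" -d '{"prompt": "Translate to English: Je t’aime."}'
  - Check API health:
    curl http://localhost:8000/alive
  - Browse the API documentation at api_url/docs (for example, http://localhost:8000/docs when the container runs locally with the port mapping above).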
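
The same two calls can also be made from Python. The following is a minimal sketch using only the standard library. It assumes the container is reachable at http://localhost:8000 as in the commands above. The helper names (`build_generate_request`, `generate`, `is_alive`) are hypothetical and not part of this repository, and because the response format of `/llm` is not documented here, the body is returned as raw text.

```python
import json
import urllib.request

# Assumption: the container was started with -p 8000:8000 on this machine.
BASE_URL = "http://localhost:8000"


def build_generate_request(prompt, base_url=BASE_URL):
    """Build the same POST request that the curl example sends to /llm."""
    payload = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/llm",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, base_url=BASE_URL, timeout=60):
    """Send the prompt to /llm and return the raw response body as text."""
    request = build_generate_request(prompt, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def is_alive(base_url=BASE_URL, timeout=5):
    """Return True if GET /alive answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(f"{base_url}/alive", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError and HTTPError both subclass OSError, so this covers
        # refused connections, timeouts, and non-2xx responses.
        return False
```

With the container running, `is_alive()` should return `True` once the service is up, and `print(generate("Translate to English: Je t’aime."))` prints whatever body the API returns for that prompt.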