We provide scripts for building your own Gradio server to run VILA models with TinyChat. Run the following commands to launch the server, each in a separate terminal, since every one of them starts a long-running process.
python -m tinychat.serve.controller --host 0.0.0.0 --port 10000
python -m tinychat.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload --share --auto-pad-image-token
After launching the web server script, the web interface is served on your machine and you can access it through a public URL (or a localhost URL).
python -m tinychat.serve.model_worker_new --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path <path-to-fp16-hf-model> --quant-path <path-to-awq-checkpoint>
# Please change tinychat.serve.model_worker_new to tinychat.serve.model_worker if you want to serve VILA rather than VILA-1.5
Note: You can attach multiple model workers to the same web server. Remember to specify a different port for each model worker.
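As a rough illustration of the note above, the following dry-run sketch prints one launch command per worker port instead of starting anything. The helper name `worker_cmd` and the second port `40001` are assumptions made for this example. The sketch also assumes that the port in the `--worker` URL should match `--port`, as it does in the command above. The two path placeholders are left as-is and must be replaced with real paths.

```shell
#!/bin/sh
# Dry run: print one launch command per model worker instead of starting it.
CONTROLLER="http://localhost:10000"
# Use tinychat.serve.model_worker instead to serve VILA rather than VILA-1.5.
WORKER_MODULE="tinychat.serve.model_worker_new"

# worker_cmd PORT: hypothetical helper that prints the worker command for PORT.
# The same port is used for --port and for the --worker URL (assumed to match).
worker_cmd() {
  printf '%s\n' "python -m $WORKER_MODULE --host 0.0.0.0 --controller $CONTROLLER --port $1 --worker http://localhost:$1 --model-path <path-to-fp16-hf-model> --quant-path <path-to-awq-checkpoint>"
}

# 40000 is the port used in the command above; 40001 is an arbitrary second port.
for port in 40000 40001; do
  worker_cmd "$port"
done
```

Each printed line can then be run in its own terminal once the placeholders are filled in, so that every worker registers with the same controller on a distinct port.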
This demo is inspired by LLaVA. We thank LLaVA for providing an elegant way to build the Gradio Web UI.