## Gradio demo: VILA with TinyChat

We provide scripts for building your own Gradio server to run VILA models with TinyChat. Please run the following commands to launch the server.

#### Launch a controller
```bash
python -m tinychat.serve.controller --host 0.0.0.0 --port 10000
```

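In practice you may want the controller to keep running after you close your terminal. A minimal sketch using standard shell job control; the log file name `controller.log` is an arbitrary choice, not something TinyChat mandates:

```shell
# Start the controller in the background and capture its output.
# controller.log is a hypothetical path -- pick any location you like.
nohup python -m tinychat.serve.controller --host 0.0.0.0 --port 10000 \
  > controller.log 2>&1 &
# $! holds the PID of the backgrounded controller, useful for later cleanup.
echo "controller PID: $!"
```

The same pattern works for the web server and model workers below.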
#### Launch a Gradio web server
```bash
python -m tinychat.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload --share
```
After launching this script, the web interface is served on your machine, and you can access it via a public URL (or a localhost URL).

#### Launch a model worker

```bash
python -m tinychat.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 --port 40000 --worker http://localhost:40000 --model-path <path-to-fp16-hf-model> --quant-path <path-to-awq-checkpoint>
```

Note: You can launch multiple model workers on the same web server; just remember to specify a different port for each worker.

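Since each worker needs its own port, a small shell loop can generate the launch commands. This is a sketch under assumptions: the model paths below are hypothetical placeholders, and it only prints the commands rather than running them, so you can review them first:

```shell
# Sketch: print one model_worker launch command per model, assigning
# consecutive ports starting at 40000. MODEL_PATHS entries are
# hypothetical placeholders -- substitute your own checkpoints.
MODEL_PATHS="models/vila-7b models/vila-13b"
PORT=40000
for MODEL in $MODEL_PATHS; do
  echo "python -m tinychat.serve.model_worker --host 0.0.0.0" \
       "--controller http://localhost:10000 --port $PORT" \
       "--worker http://localhost:$PORT --model-path $MODEL"
  # Bump the port so the next worker does not collide with this one.
  PORT=$((PORT + 1))
done
```

You can pipe the output to `sh` (or run each printed line yourself); every worker then registers with the controller on port 10000.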
### Acknowledgement

This demo is inspired by [LLaVA](https://github.com/haotian-liu/LLaVA). We thank LLaVA for providing an elegant way to build the Gradio Web UI.