- Follow the instructions here to build the model from either a Hugging Face URL or a local directory. If you opt for a local directory, follow the instructions here to obtain the original LLaMA weights in the Hugging Face format, and here to obtain the Vicuna weights.
```shell
git clone https://github.com/mlc-ai/mlc-llm.git --recursive
cd mlc-llm

# From Hugging Face URL
python3 build.py --hf-path databricks/dolly-v2-3b --quantization q3f16_0 --max-seq-len 768

# From local directory
python3 build.py --model path/to/vicuna-v1-7b --quantization q3f16_0 --max-seq-len 768

# If the model path is in the form of `dist/models/model_name`,
# we can simplify the build command to
# python3 build.py --model model_name --quantization q3f16_0 --max-seq-len 768
```
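The `dist/models/model_name` shorthand above can be sketched as a small shell helper. This is an illustration only; the `model_arg` function is my own naming, not part of mlc-llm:

```shell
# model_arg: derive the --model argument from a model path.
# If the path is of the form dist/models/<name>, only <name> is needed;
# otherwise the full path is passed through unchanged.
model_arg() {
  case "$1" in
    dist/models/*) printf '%s\n' "${1#dist/models/}" ;;
    *)             printf '%s\n' "$1" ;;
  esac
}

model_arg dist/models/vicuna-v1-7b   # prints: vicuna-v1-7b
model_arg path/to/vicuna-v1-7b       # prints: path/to/vicuna-v1-7b
```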
- Build the CLI.
```shell
# Compile and build
cd build
cmake ..
make
cd ..

# Execute the CLI
./build/mlc_chat_cli --model vicuna-v1-7b
```