| Documentation | Intel® Gaudi® Documentation | Optimizing Training Platform Guide |
Latest News 🔥
- [2025/06] We introduced an early developer preview of the vLLM Gaudi Plugin. It is not yet intended for general use. For a more stable experience, consider using the HabanaAI/vllm-fork or the in-tree Gaudi implementation available in vllm-project/vllm.
vLLM Gaudi plugin (vllm-gaudi) integrates Intel Gaudi accelerators with vLLM to optimize large language model inference.
This plugin follows the [RFC]: Hardware pluggable and [RFC]: Enhancing vLLM Plugin Architecture principles, providing a modular interface for Intel Gaudi hardware.
Learn more: 🚀 vLLM Plugin System Overview
- Preparation of the Setup

  To set up the execution environment, please follow the instructions in the Gaudi Installation Guide. To achieve the best performance on HPU, please follow the methods outlined in the Optimizing Training Platform Guide.
- Get the last good vLLM commit

  NOTE: vllm-gaudi always tracks the latest vLLM commit. Because an upstream vLLM API change can break vllm-gaudi, the commit saved here is verified against vllm-gaudi on an hourly basis.

      git clone https://github.com/vllm-project/vllm-gaudi
      cd vllm-gaudi
      export VLLM_COMMIT_HASH=$(git show "origin/vllm/last-good-commit-for-vllm-gaudi:VLLM_STABLE_COMMIT" 2>/dev/null)
      # Return to the parent directory so that the later "cd vllm-gaudi" step works
      cd ..
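Because the `git show` call above sends its errors to `/dev/null`, a failed lookup leaves `VLLM_COMMIT_HASH` empty without printing anything. The following is a minimal sketch of a guard that could run before the checkout step. The helper name `is_commit_hash` is hypothetical and not part of vllm-gaudi, and the sketch assumes the `VLLM_STABLE_COMMIT` file holds a plain hexadecimal commit ID of 7 to 40 characters.

```shell
# Hypothetical helper (not part of vllm-gaudi): succeeds only if the argument
# is 7 to 40 lowercase hexadecimal characters, i.e. looks like a Git commit ID.
is_commit_hash() {
    case "$1" in
        ''|*[!0-9a-f]*) return 1 ;;   # empty, or contains a non-hex character
    esac
    [ "${#1}" -ge 7 ] && [ "${#1}" -le 40 ]
}

# Warn instead of silently checking out an empty or malformed revision.
if is_commit_hash "${VLLM_COMMIT_HASH:-}"; then
    echo "Using vLLM commit ${VLLM_COMMIT_HASH}"
else
    echo "VLLM_COMMIT_HASH is empty or malformed; re-run the git show step" >&2
fi
```

If the warning appears, running the `git show` command again without `2>/dev/null` should reveal the underlying Git error.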
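- Install vLLM with pip or from source:

      # Build vLLM from source for the empty platform, reusing the existing torch installation
      git clone https://github.com/vllm-project/vllm
      cd vllm
      git checkout "$VLLM_COMMIT_HASH"
      # Skip torch in the build requirements. /^torch/ matches the literal prefix,
      # whereas the bracket expression [torch] matches any line starting with t, o, r, c, or h.
      pip install -r <(sed '/^torch/d' requirements/build.txt)
      VLLM_TARGET_DEVICE=empty pip install --no-build-isolation -e .
      cd ..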
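To make the requirements filtering in this step concrete, here is a small sketch that applies a literal `torch` prefix filter to a made-up requirements file. The file contents and version pins are invented for illustration and are not the real `requirements/build.txt`.

```shell
# Made-up requirements file for illustration only; not the real requirements/build.txt.
req_file=$(mktemp)
printf '%s\n' 'cmake>=3.26' 'ninja' 'torch==2.0.0' 'torchvision' 'wheel' > "$req_file"

# Literal prefix match: drops only the lines that start with "torch".
filtered=$(sed '/^torch/d' "$req_file")
printf '%s\n' "$filtered"

rm -f "$req_file"
```

Only the `cmake`, `ninja`, and `wheel` lines remain. A bracket expression such as `/^[torch]/d` behaves differently: it matches a single character from the set t, o, r, c, h, so it would also delete the `cmake` line.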
- Install vLLM-Gaudi from source:

      cd vllm-gaudi
      pip install -e .
      cd ..
- For all installation methods, such as NixL, follow the link.
We welcome and value any contributions and collaborations.
- For technical questions and feature requests, please use GitHub Issues
- For discussing with fellow users, please use the vLLM Forum
- For coordinating contributions and development, please use Slack
- For security disclosures, please use GitHub's Security Advisories feature