This repository provides a set of ROS 2 packages that integrate whisper.cpp into ROS 2 using audio_common 4.0.3. It also uses silero-vad for voice activity detection (VAD).
ROS 2 Distro | Branch |
---|---|
Humble | main |
Iron | main |
Jazzy | main |
Rolling | main |
- chatbot_ros → This chatbot, integrated into ROS 2, uses whisper_ros to listen to people's speech and llama_ros to generate responses. The chatbot is controlled by a state machine created with YASMIN.
To run whisper_ros with CUDA, you must first install the CUDA Toolkit.
LM-Studio must be installed and a model must be set up to run; otherwise, the model will not respond.
After downloading LM-Studio, open the "Developer" section (shown in green on the right-hand sidebar). Load a model (Llama is a reasonable choice) and switch the status slider to "Running". You can then proceed.
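Before continuing, it can help to confirm that the LM-Studio server is actually reachable. The sketch below assumes that LM-Studio exposes its OpenAI-compatible API at the default address `http://localhost:1234/v1`. That address and the helper names `parse_model_ids` and `list_models` are assumptions for illustration and are not part of this repository.

```python
import http.client
import json
import urllib.error
import urllib.request

# Assumed default address of the LM-Studio local server; adjust if you changed the port.
BASE_URL = "http://localhost:1234/v1"


def parse_model_ids(payload: str) -> list[str]:
    """Extract model identifiers from an OpenAI-style /models JSON response."""
    data = json.loads(payload)
    return [entry["id"] for entry in data.get("data", []) if "id" in entry]


def list_models(base_url: str = BASE_URL, timeout: float = 3.0) -> list[str]:
    """Return the model ids reported by the server, or [] if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
            return parse_model_ids(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, or a malformed reply all count as "not ready".
        return []


if __name__ == "__main__":
    models = list_models()
    if models:
        print("LM-Studio is serving:", ", ".join(models))
    else:
        print("No model reported; check that the status slider is set to Running.")
```

An empty result usually means the server is not running or no model is loaded yet.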
mkdir -p ~/ros2_ws/src
cd ~/ros2_ws/src
git clone https://github.com/mgonzs13/audio_common.git
git clone git@github.com:socialdroids/whisper_ros.git
pip3 install -r whisper_ros/requirements.txt
cd ~/ros2_ws
rosdep install --from-paths src --ignore-src -r -y
colcon build --cmake-args -DGGML_CUDA=ON # add this for CUDA
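The CUDA flag only makes sense when the CUDA Toolkit is installed. As a minimal sketch, the helper below assembles the `colcon` command from the steps above and adds `-DGGML_CUDA=ON` only when `nvcc` is found on `PATH`. Using `nvcc` as the detection signal and the function names `cuda_toolkit_available` and `colcon_build_command` are assumptions for illustration, not something this repository provides.

```python
import shutil


def cuda_toolkit_available() -> bool:
    """Heuristic (assumption): treat the CUDA Toolkit as installed if nvcc is on PATH."""
    return shutil.which("nvcc") is not None


def colcon_build_command(use_cuda: bool) -> list[str]:
    """Return the colcon invocation, with the CUDA CMake flag only when requested."""
    command = ["colcon", "build"]
    if use_cuda:
        command += ["--cmake-args", "-DGGML_CUDA=ON"]
    return command


if __name__ == "__main__":
    # Print the command that would be run from the workspace root (~/ros2_ws).
    print(" ".join(colcon_build_command(cuda_toolkit_available())))
```

To run the build from a script rather than only printing it, the returned list can be passed to `subprocess.run(command, cwd=workspace, check=True)` with `workspace` pointing at `~/ros2_ws`.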
Run Silero for VAD and Whisper for STT:
ros2 launch whisper_bringup whisper.launch.py
Try the example whisper client:
ros2 run whisper_demos whisper_demo_node