Voice-controlled car audio player, powered by Raspberry Pi, voice2json, and Mopidy. Also supports a plaintext listener, via MQTT.
Rather than using a wake-word (which has a delay, and might result in false positives from music), this project uses a Bluetooth media controller, in "walkie-talkie" mode (press to talk, release to execute).
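The press-to-talk flow can be pictured as a two-state machine. The following is only a rough sketch, not the project's actual code: `PushToTalk`, `PttHandlers`, and both callbacks are hypothetical names, and it assumes the controller delivers separate key-down and key-up events.

```typescript
// Hypothetical sketch: none of these names come from the voicetunes source.
type PttState = "idle" | "listening";

interface PttHandlers {
  startListening: () => void; // e.g. begin capturing microphone audio
  stopAndExecute: () => void; // e.g. stop capturing and run the command
}

class PushToTalk {
  private state: PttState = "idle";
  private handlers: PttHandlers;

  constructor(handlers: PttHandlers) {
    this.handlers = handlers;
  }

  // Button pressed: start listening, ignoring repeated press events.
  keyDown(): void {
    if (this.state === "listening") return;
    this.state = "listening";
    this.handlers.startListening();
  }

  // Button released: execute only if a press was actually in progress.
  keyUp(): void {
    if (this.state !== "listening") return;
    this.state = "idle";
    this.handlers.stopAndExecute();
  }

  current(): PttState {
    return this.state;
  }
}

const pttEvents: string[] = [];
const ptt = new PushToTalk({
  startListening: () => pttEvents.push("start"),
  stopAndExecute: () => pttEvents.push("execute"),
});

ptt.keyDown();
ptt.keyDown(); // repeated press is ignored
ptt.keyUp();
ptt.keyUp(); // stray release is ignored
```

Because the state is tracked explicitly, key-repeat events and stray releases from the controller cannot start or finish a command twice.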
- "Play something by <artist>"
- "Play some <genre>"
- "Play song <trackname>"
- "Play track <trackname> by <artist>"
- "Play album <album>"
- "Play album <album> by <artist>"
- "Play the <nth> album by <artist>"
- "Play the latest album by <artist>"
- "Start playlist <playlist>"
- "Shuffle playlist <playlist>"
- "Play <genre> from the year <year>"
- "Play an album from the <decade>'s"
See sentences.ini.ts for the full list of grammars, in the simplified JSGF format.
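Commands such as "Play the <nth> album by <artist>" and "Play the latest album by <artist>" imply some ordering of an artist's albums. Here is a minimal sketch of one way to resolve them, assuming albums are ordered by release year. The `Album` shape, the `ORDINALS` table, and `pickAlbum` are all hypothetical and may not match how the project queries its database.

```typescript
// Hypothetical sketch: assumes albums are ranked by release year.
interface Album {
  title: string;
  year: number;
}

// Spoken ordinals mapped to 1-based positions (illustrative subset).
const ORDINALS: Record<string, number> = {
  first: 1,
  second: 2,
  third: 3,
  fourth: 4,
  fifth: 5,
};

// Return the album matching "latest" or a spoken ordinal, if any.
function pickAlbum(albums: Album[], which: string): Album | undefined {
  // Copy before sorting so the caller's array is left untouched.
  const byYear = [...albums].sort((a, b) => a.year - b.year);
  if (which === "latest") return byYear[byYear.length - 1];
  const n = ORDINALS[which];
  return n === undefined ? undefined : byYear[n - 1];
}

const discography: Album[] = [
  { title: "In Utero", year: 1993 },
  { title: "Bleach", year: 1989 },
  { title: "Nevermind", year: 1991 },
];

const secondAlbum = pickAlbum(discography, "second");
const latestAlbum = pickAlbum(discography, "latest");
```

An unknown ordinal, or one beyond the number of albums, returns `undefined`, so the caller can decide how to report a miss.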
voice2json will make educated guesses for any album, track, etc., that it doesn't know how to pronounce. You can train its pronunciations by adding a sounds_like.txt to the data directory (refer to the example file, and the documentation).
cd ~
git clone https://github.com/lukifer/voicetunes
cd voicetunes
sudo ./setup.sh
To run at startup (and auto-restart on a fatal error), run sudo crontab -e and add:
@reboot cd /home/pi/voicetunes/; ./ramdisk.sh; sudo npm run start
* * * * * cd /home/pi/voicetunes/; sudo npm run start
I use the Respeaker 4-Mic HAT, but any ALSA or Pulse mic input should work. If not using the Respeaker, set { "USE_LED": false } in config.local.json.
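A local config file like this is usually layered over built-in defaults. The sketch below illustrates that pattern only. It assumes config.local.json overrides default values key by key, which I have not confirmed against the source. `Config`, `defaultConfig`, and `applyLocal` are hypothetical, and the `KEY_DOWN` value is a placeholder.

```typescript
// Hypothetical sketch: only the key names USE_LED and KEY_DOWN appear in
// this README; the types and default values here are placeholders.
interface Config {
  USE_LED: boolean;
  KEY_DOWN: number;
}

const defaultConfig: Config = { USE_LED: true, KEY_DOWN: 1 };

// Overlay the parsed local JSON on the defaults; keys absent from the
// local file keep their default values.
function applyLocal(base: Config, localJson: string): Config {
  const local = JSON.parse(localJson) as Partial<Config>;
  return { ...base, ...local };
}

const mergedConfig = applyLocal(defaultConfig, '{ "USE_LED": false }');
```

Here `mergedConfig.USE_LED` is `false`, while `KEY_DOWN` keeps its default because the local JSON does not mention it.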
While the Pi's built-in audio works, its quality is thoroughly mediocre. I recommend a USB DAC; I use the Fiio K3, and I'm quite happy with it.
I use a Tunia Media Button, but this should in principle work with any Bluetooth LE media controller. You may have to change the values for KEY_DOWN, etc., in config.local.json (you can experiment with bt.ts to find the right values).
Alternatively, you can set up an MQTT listener to receive voice (or text) commands over a network:
cd /home/pi/voicetunes
npm run mqtt
mosquitto_pub -h raspberrypi.local -t "text2json" -m "play something by nirvana"
Unless you have another plan for power management, I recommend the LiFePO4wered UPS. It can ensure a safe, smooth shutdown, and is highly configurable.
log2ram is not strictly needed, but microSD cards have a limited lifespan, and moving all system-level logging to RAM can extend it. (voicetunes keeps its own log of all commands at ~/voicetunes/log.txt, which can be useful for debugging.)
echo "deb http://packages.azlux.fr/debian/ buster main" | sudo tee /etc/apt/sources.list.d/azlux.list
wget -qO - https://azlux.fr/repo.gpg.key | sudo apt-key add -
sudo apt update
sudo apt install log2ram
This was built to export from an iTunes/Music.app library, and has a script to convert the iTunes XML to a SQLite database. You can also build this database yourself if your music is coming from a different source; refer to the schema and sample data in tests/testDb.sql.
Music.app in Catalina / Big Sur no longer exports the Library XML automatically. You can hand-export from File -> Library -> Export Library, or run the included AppleScript:
osascript exportlibrary.applescript
To sync music to your Pi (after this package has been installed and set up), run the following:
git clone https://github.com/lukifer/voicetunes.git
cd voicetunes
./build.sh pi@raspberrypi.local:/home/pi/voicetunes
rsync -az ~/Music/iTunes/iTunes\ Media/Music/ pi@raspberrypi.local:/home/pi/music/
- Command: "Clear queue"
- Command to identify current track
- Command: "Jump to track <N>"
- Command: "Play the <Nth> track from <album>"
- Command: "Play a new track by <artist>"
- Replace exec calls with Node sockets and pipes
- Option to use mopidy local library and native M3U playlists
- Option to cache entire voice2json profile to RAM disk
- Option to cache entire SQLite db to RAM disk
- Option to restore previous state on startup
- Your suggestion here!