This Node.js application uses the Eleven Labs API to synthesize text to speech. It takes text and voice settings as input and returns the synthesized audio as a data URI that you can use in an Audio step in your Assistant.
Loom video: https://www.loom.com/share/a9a0abf118594ba590386024c905596f
Prerequisites:

- Node.js installed on your machine
- An Eleven Labs API key

Installation:
- Clone the repository:

  ```
  git clone https://github.com/voiceflow-gallagan/VF-ElevenLabs.git
  ```
- Change to the project directory:

  ```
  cd VF-ElevenLabs
  ```
- Install the required dependencies:

  ```
  npm install
  ```
- Replace `your_api_key_here` in the `.env.template` file with your Eleven Labs API key:

  ```
  ELEVENLABS_API_KEY=your_api_key_here
  PORT=3000
  ```
- Rename the `.env.template` file to `.env`.
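Once the `.env` file is in place, the server can read both values at startup. A minimal sketch, assuming the project loads them with the `dotenv` package (an assumption, not confirmed by this README):

```js
// Load .env into process.env at startup (assumes the dotenv package).
require('dotenv').config();

const apiKey = process.env.ELEVENLABS_API_KEY; // Eleven Labs API key
const port = process.env.PORT || 3000;         // server port, default 3000
```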
Usage:

Run the following command in the project directory:

```
npm start
```
The server will start listening on the specified port (default: 3000). You can now make a POST request to the `/synthesize` endpoint with the text and voice settings as input.
`POST /synthesize`
Request Body:
- `text` (required): The text to synthesize.
- `voice` (optional): The ID of the voice to use for synthesis. Default is `21m00Tcm4TlvDq8ikWAM`.
- `voice_settings` (optional): An object containing additional voice settings:
  - `stability` (default: 0)
  - `similarity_boost` (default: 0)
Response:

Returns an object containing the synthesized audio in a data URI format:

```json
{
  "audioDataURI": "data:audio/mpeg;base64,..."
}
```
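For orientation, here is a rough sketch of how such a handler can be implemented. This is not the repository's actual code: Express, axios, and dotenv are assumed, and the Eleven Labs text-to-speech call reflects common usage of their v1 API:

```js
// Sketch only: the repository's actual implementation may differ.
require('dotenv').config();
const express = require('express');
const axios = require('axios');

const app = express();
app.use(express.json());

app.post('/synthesize', async (req, res) => {
  const {
    text,
    voice = '21m00Tcm4TlvDq8ikWAM',
    voice_settings = { stability: 0, similarity_boost: 0 },
  } = req.body;
  if (!text) {
    return res.status(400).json({ error: 'text is required' });
  }

  try {
    // Eleven Labs text-to-speech endpoint; responds with raw MP3 bytes.
    const response = await axios.post(
      `https://api.elevenlabs.io/v1/text-to-speech/${voice}`,
      { text, voice_settings },
      {
        headers: {
          'xi-api-key': process.env.ELEVENLABS_API_KEY,
          Accept: 'audio/mpeg',
        },
        responseType: 'arraybuffer',
      }
    );

    // Base64-encode the audio into a data URI for use in an Audio step.
    const base64Audio = Buffer.from(response.data).toString('base64');
    res.json({ audioDataURI: `data:audio/mpeg;base64,${base64Audio}` });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(process.env.PORT || 3000);
```

Returning a data URI keeps the response self-contained: the audio can be played directly from the JSON payload without hosting a file anywhere.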
Example request:

```
curl -X POST -H "Content-Type: application/json" \
  -d '{"text": "Hello, world!", "voice": "21m00Tcm4TlvDq8ikWAM", "voice_settings": {"stability": 0, "similarity_boost": 0}}' \
  http://localhost:3000/synthesize
```

Example response:

```json
{
  "audioDataURI": "data:audio/mpeg;base64,//uQZAAAAAAAAAAAAAAAAAAAAA"
}
```
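The same request can be made from Node.js. A quick sketch, assuming Node 18+ (for the global `fetch`), that also decodes the data URI back into an MP3 file:

```js
// Same request as the curl example above; requires Node 18+ for global fetch.
const fs = require('fs');

async function synthesize() {
  const res = await fetch('http://localhost:3000/synthesize', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      text: 'Hello, world!',
      voice: '21m00Tcm4TlvDq8ikWAM',
      voice_settings: { stability: 0, similarity_boost: 0 },
    }),
  });
  const { audioDataURI } = await res.json();

  // The data URI has the form "data:audio/mpeg;base64,<payload>";
  // decoding the payload recovers the raw MP3 bytes.
  const mp3 = Buffer.from(audioDataURI.split(',')[1], 'base64');
  fs.writeFileSync('hello.mp3', mp3);
}

synthesize().catch(console.error);
```

In Voiceflow itself no decoding is needed: pass `audioDataURI` straight to an Audio step.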