[Agent] ElevenLabs TTS tool #263


Open: wants to merge 5 commits into main

Conversation

Guikingone
Contributor

| Q | A |
| --- | --- |
| Bug fix? | no |
| New feature? | yes |
| Docs? | yes |
| Issues | None |
| License | MIT |

Hi 👋🏻

This PR introduces a new native tool that lets the agent run a TTS pipeline. The HTTP call is made against the ElevenLabs API (TTS only for now).

Once the response is received, the file is stored at the location given by the $path property. The unit test runs against the fixture file (not sure about this approach, but it works).
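
The flow described above (POST the text to ElevenLabs, then write the returned audio to disk) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code from this PR: the function name `textToSpeech`, the injectable `$transport` callable, and the returned array keys are assumptions made for the example. The endpoint shape (`POST /v1/text-to-speech/{voice_id}` with an `xi-api-key` header) follows ElevenLabs' public API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: sends text to the ElevenLabs TTS endpoint and stores the audio.
 *
 * @param callable(string, array<string, string>, string): string $transport
 *        Assumed helper: receives URL, headers and JSON body, returns the raw response body.
 *
 * @return array{input: string, path: string} Keys are illustrative, not the PR's actual output.
 */
function textToSpeech(
    callable $transport,
    string $apiKey,
    string $voice,
    string $model,
    string $path,
    string $text
): array {
    $url = sprintf('https://api.elevenlabs.io/v1/text-to-speech/%s', rawurlencode($voice));

    // The API answers with the binary audio stream for the given text.
    $audio = $transport(
        $url,
        ['xi-api-key' => $apiKey, 'Content-Type' => 'application/json'],
        json_encode(['text' => $text, 'model_id' => $model], JSON_THROW_ON_ERROR)
    );

    // Persist the audio at the configured path.
    if (false === file_put_contents($path, $audio)) {
        throw new RuntimeException(sprintf('Unable to write audio to "%s".', $path));
    }

    return ['input' => $text, 'path' => $path];
}

// Usage with a fake transport, so nothing is sent over the network.
$fakeTransport = static fn (string $url, array $headers, string $body): string => 'fake-mp3-bytes';
$path = sys_get_temp_dir().'/elevenlabs-sketch.mp3';
$result = textToSpeech($fakeTransport, 'api-key', 'voice-id', 'eleven_multilingual_v2', $path, 'Hello world');
```

Injecting the transport keeps the sketch testable without network access, which mirrors the mock HTTP client approach mentioned in the review; a real implementation would more likely rely on an HTTP client and `symfony/filesystem`, as the changed files suggest.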

@carsonbot carsonbot added Agent Issues & PRs about the AI Agent component Feature New feature Status: Needs Review labels Aug 5, 2025
@Guikingone Guikingone force-pushed the agent/eleven_labs_tool branch 2 times, most recently from e65c9fc to 6bd71bd Compare August 5, 2025 17:29
@OskarStark OskarStark requested a review from Copilot August 5, 2025 17:32

@Copilot Copilot AI left a comment


Pull Request Overview

This PR introduces a new native tool for ElevenLabs Text-to-Speech (TTS) functionality within the Symfony AI Agent package. The tool enables agents to convert text to speech by making HTTP calls to the ElevenLabs API and storing the generated audio files locally.

Key changes:

  • Adds ElevenLabs TTS tool with configurable voice, model, and output path
  • Includes comprehensive test coverage using mock HTTP client
  • Provides example usage and documentation updates

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/agent/src/Toolbox/Tool/ElevenLabs.php | Core TTS tool implementation with ElevenLabs API integration |
| src/agent/tests/Toolbox/Tool/ElevenLabsTest.php | Unit tests for the ElevenLabs tool functionality |
| src/agent/composer.json | Adds symfony/filesystem dependency for file operations |
| examples/misc/text-to-speech.php | Working example demonstrating tool usage |
| src/agent/doc/index.rst | Documentation updates referencing the new example |
| examples/.env | Environment variable for ElevenLabs API key |

Comments suppressed due to low confidence (1)

src/agent/tests/Toolbox/Tool/ElevenLabsTest.php:38

  • The test doesn't verify that the file was actually created or that the 'path' key contains the expected file path. Consider adding assertions to validate the file creation and path structure.
        $this->assertCount(2, $result);

@Guikingone Guikingone force-pushed the agent/eleven_labs_tool branch from b57a954 to ffaa356 Compare August 5, 2025 19:01
@Guikingone Guikingone requested a review from OskarStark August 5, 2025 19:03
@Guikingone Guikingone force-pushed the agent/eleven_labs_tool branch from ffaa356 to 5cb747c Compare August 5, 2025 20:07
@Guikingone Guikingone force-pushed the agent/eleven_labs_tool branch from 5cb747c to ef185a8 Compare August 7, 2025 08:55
Member

@chr-hertel chr-hertel left a comment


ElevenLabs 🥳

Great stuff - thanks for having a look at that platform! So there are multiple things in here - and we should split it:

1. Instead of focusing on the tool, let's bring in a Platform bridge first, please: basically a Symfony\AI\Platform\Bridge\ElevenLabs with text input and binary result.
2. You're twisting my mind here, but you're right: we should enable agents, out of the box, to use platforms as a tool as well, not only agents.
3. And maybe even a third point: in the demo project, for example, there is some kind of pre-processing with Whisper (speech-to-text), and this could also be post-processing with text-to-speech. This should be standardized somehow ...

a lot of stuff in those 165 lines of code 😆

are you up for starting with the platform bridge?

@Guikingone
Contributor Author

> are you up for starting with the platform bridge?

Yes, of course.

I'll take a look at the pre-processing part 👍🏻

Labels: Agent (Issues & PRs about the AI Agent component), Feature (New feature), Status: Needs Work

4 participants