AI-powered tool for seamless speech-to-text, translation, and speech synthesis. Built with modern NLP and text-to-speech technologies.
- Transcription: Converts audio to text using Whisper.
- Translation: Translates text to English.
- Text-to-Speech: Converts translated text to natural speech using ElevenLabs.
- Gradio Interface: Simple, intuitive UI.
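The features above chain into a single flow: audio in, text out, English text, then speech. A minimal sketch of that flow, assuming each stage is a plain callable. The names `SpeechPipeline`, `transcribe`, `translate`, and `synthesize` are hypothetical and are not taken from the repository's `app.py`:

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class SpeechPipeline:
    """Chains the three stages; every callable is a hypothetical stand-in."""

    transcribe: Callable[[str], str]    # audio file path -> source-language text (e.g. Whisper)
    translate: Callable[[str], str]     # source-language text -> English text
    synthesize: Callable[[str], bytes]  # English text -> audio bytes (e.g. ElevenLabs)

    def run(self, audio_path: str) -> Tuple[str, str, bytes]:
        text = self.transcribe(audio_path)
        if not text.strip():
            # Fail early instead of sending empty text to the later stages.
            raise ValueError("transcription produced no text")
        english = self.translate(text)
        return text, english, self.synthesize(english)


if __name__ == "__main__":
    # Stub stages so the sketch runs without any model or API key.
    demo = SpeechPipeline(
        transcribe=lambda path: "hola mundo",
        translate=lambda text: "hello world",
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(demo.run("sample.wav"))
```

Keeping the stages as injected callables means each one can be swapped or stubbed independently, which is convenient when testing without network access.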
- Clone the Repo
  git clone https://github.com/diegoruny/StS-translator
  cd StS-translator
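- Install Dependencies
  python -m venv .venv
  .venv\Scripts\Activate
  pip install -r requirements.txt
  (The activation command above is for Windows; on macOS or Linux, use source .venv/bin/activate instead.)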
- Set Up API Key
  - Create a .env file:
    ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
- Run the App
  python app.py
- Use the Interface
  - Access the app at http://127.0.0.1:7860/.
  - Upload or record audio, receive translated speech.
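The Set Up API Key step stores the ElevenLabs key in a `.env` file. How the app reads that file is not shown here; projects like this often rely on the `python-dotenv` package. The following is a standard-library stand-in, assuming simple `KEY=value` lines. `load_env_file` and `require_key` are hypothetical helpers, not functions from the repository:

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines; skip blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def require_key(name: str = "ELEVENLABS_API_KEY", path: str = ".env") -> str:
    """Return the key from the environment or the .env file, or fail early."""
    value = os.environ.get(name) or load_env_file(path).get(name, "")
    if not value or value == "your_elevenlabs_api_key_here":
        raise RuntimeError(f"{name} is not set; add it to {path}")
    return value
```

Checking for the placeholder value catches the common mistake of copying the template line without replacing it, so the failure shows up at startup and not on the first text-to-speech request.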
- Whisper for transcription.
- Translator for language conversion.
- ElevenLabs for text-to-speech.
- Gradio for the user interface.
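A minimal sketch of how a Gradio front end could expose this flow, assuming the `gradio` package is installed. `translate_speech` is a hypothetical placeholder that returns its input instead of calling Whisper or ElevenLabs, and neither function name comes from the repository's `app.py`:

```python
from typing import Optional


def translate_speech(audio_path: Optional[str]) -> Optional[str]:
    """Hypothetical handler: receives the path of the uploaded or recorded file.

    A real handler would transcribe, translate, and synthesize here. This
    placeholder returns the input path, so the UI plays back the same audio.
    """
    if not audio_path:
        return None
    return audio_path


def build_interface():
    # Imported lazily so the handler can be tested without Gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=translate_speech,
        inputs=gr.Audio(type="filepath"),  # handler receives a file path string
        outputs=gr.Audio(),                # handler returns a path to play back
        title="Speech-to-Speech Translator",
    )
```

Calling `build_interface().launch()` starts a local server; Gradio's default address is http://127.0.0.1:7860/.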
This project was inspired by a YouTube video, itself based on a post, and has been extended and customized to improve user experience and functionality.
MIT License. See LICENSE for details.