Using the OpenAI Realtime API can incur significant costs. I strongly advise against using personal savings to try this tool, as the expenses can quickly outweigh the benefits. Proceed with caution and monitor your usage.
OpenAI Advanced Voice Python Assistant is a Python-based assistant that uses OpenAI's Realtime API for real-time audio interactions. Designed for personal experimentation, it supports live audio conversations, transcription of both user and assistant speech, and tool calling for extended functionality. The project is inspired by and based on the openai-realtime-py repository. It has been tested on Windows 11 and aims to be compatible with Linux.
- Real-time Voice Interaction: Live audio conversations with the assistant.
- Speech Transcription: Transcription of both user and assistant speech.
- Tool Calling: Invoke predefined tools during conversations.
- Interruptions Support: Interrupt the assistant while it is speaking.
- Debug Logging: Optionally log all incoming messages for debugging purposes.
Prerequisites:
- Python 3.7 or higher
- Virtual Environment (recommended)
- Clone the Repository
      git clone https://github.com/your-username/openai-advanced-voice-python.git
      cd openai-advanced-voice-python
- Create a Virtual Environment (optional)
  - On Windows:
        python -m venv venv
        venv\Scripts\activate
  - On Linux:
        python3 -m venv venv
        source venv/bin/activate
- Install Dependencies
      pip install -r requirements.txt
- Environment Variables
  - Copy .env.example to .env:
        cp .env.example .env
  - Open the .env file and configure the following variable:
        OPENAI_API_KEY=your-openai-key
    OPENAI_API_KEY: Your OpenAI API key.
- Configuration File
  - Copy config.yaml.example to config.yaml:
        cp config.yaml.example config.yaml
  - Open the config.yaml file and adjust the settings as you see fit.
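As a rough illustration of the environment step above, the following sketch reads KEY=value pairs from a .env file using only the standard library and fails early if OPENAI_API_KEY is missing. The helper names load_env_file and require_api_key are hypothetical and not taken from the project, which may well rely on a library such as python-dotenv instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(path=".env"):
    """Return the key from the process environment, else from the .env file."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key
```

Checking for the key before opening any audio stream or network connection keeps a misconfigured setup from failing halfway through a session.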
- Activate Virtual Environment (optional)
  - On Windows:
        venv\Scripts\activate
  - On Linux:
        source venv/bin/activate
- Run the Assistant
      python realtime_assistant.py
- Interact
  - Speak into your microphone to communicate with the assistant.
  - The assistant will respond in real time with audio and text transcriptions.
The assistant supports tool calling, allowing it to perform specific functions during the conversation. Available tools include:
- write_to_console
  - Description: Writes a specified message to the console.
  - Parameters:
    - message (string): The message to write.
- save_to_file
  - Description: Saves user-specified content to a file.
  - Parameters:
    - file_name (string): Name of the file without extension.
    - file_extension (string): File extension (e.g., txt, md).
    - file_content (string): Content to write into the file.
Additional tools can be added by extending the TOOLS list in the configuration.
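To make the tool descriptions above more concrete, here is a minimal sketch of how the two tools might be declared and dispatched. The schema layout (a flat "function" entry with a JSON Schema under "parameters") follows the function-tool format as I understand it from the Realtime API; the actual structure of this project's TOOLS list, the HANDLERS mapping, and the dispatch_tool_call helper are all assumptions made for illustration.

```python
import json
from pathlib import Path

# Assumed shape of the TOOLS list; the project's real entries may differ.
TOOLS = [
    {
        "type": "function",
        "name": "write_to_console",
        "description": "Writes a specified message to the console.",
        "parameters": {
            "type": "object",
            "properties": {"message": {"type": "string"}},
            "required": ["message"],
        },
    },
    {
        "type": "function",
        "name": "save_to_file",
        "description": "Saves user-specified content to a file.",
        "parameters": {
            "type": "object",
            "properties": {
                "file_name": {"type": "string"},
                "file_extension": {"type": "string"},
                "file_content": {"type": "string"},
            },
            "required": ["file_name", "file_extension", "file_content"],
        },
    },
]


def write_to_console(message):
    print(message)
    return "ok"


def save_to_file(file_name, file_extension, file_content):
    # Tolerate an extension passed with a leading dot, e.g. ".md".
    path = Path(f"{file_name}.{file_extension.lstrip('.')}")
    path.write_text(file_content, encoding="utf-8")
    return str(path)


# Hypothetical lookup table from tool name to Python handler.
HANDLERS = {"write_to_console": write_to_console, "save_to_file": save_to_file}


def dispatch_tool_call(name, arguments_json):
    """Decode the JSON argument string and call the matching handler."""
    if name not in HANDLERS:
        raise ValueError(f"Unknown tool: {name}")
    return HANDLERS[name](**json.loads(arguments_json))
```

Under these assumptions, adding a tool means appending one schema entry to TOOLS and registering one handler under the same name.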
If debug_mode is enabled in the config.yaml file, all incoming messages (excluding certain binary types) will be logged to incoming_messages.log. This is useful for debugging and monitoring the assistant's interactions.
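The logging behaviour described above could look roughly like the sketch below, which writes each incoming JSON message to incoming_messages.log and skips audio payloads. Which message types the project actually excludes is not stated, so the SKIPPED_TYPES set is an assumption, and make_message_logger and log_incoming are hypothetical helper names.

```python
import json
import logging

# Assumption: event types carrying large base64 audio payloads are skipped.
SKIPPED_TYPES = {"response.audio.delta"}


def make_message_logger(log_path="incoming_messages.log"):
    """Create a file-backed logger, or reuse it if already configured."""
    logger = logging.getLogger("incoming_messages")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep these messages out of the root logger
    if not logger.handlers:
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_incoming(logger, raw_message, debug_mode):
    """Log one incoming JSON message; return True if it was written."""
    if not debug_mode:
        return False
    event = json.loads(raw_message)
    if event.get("type") in SKIPPED_TYPES:
        return False
    logger.debug(json.dumps(event))
    return True
```

Filtering by event type before writing keeps the log readable, since audio deltas would otherwise dominate the file.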
When allow_interruptions is enabled in the config.yaml file, the assistant keeps the microphone active while it speaks, so users can interrupt it.
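One plausible way to act on an interruption is sketched below: when the server reports that the user has started speaking, the client drops any audio still waiting to be played and asks the server to stop the current response. The event names input_audio_buffer.speech_started and response.cancel are Realtime API events as I understand them and should be checked against the current API reference; whether this project handles interruptions this way is an assumption, and handle_server_event is a hypothetical helper.

```python
import queue


def handle_server_event(event, playback_queue, allow_interruptions):
    """Return the client events to send in response to one server event."""
    if not allow_interruptions:
        return []
    if event.get("type") != "input_audio_buffer.speech_started":
        return []
    # Drop audio that has been received but not yet played.
    while True:
        try:
            playback_queue.get_nowait()
        except queue.Empty:
            break
    # Ask the server to stop generating the current response.
    return [{"type": "response.cancel"}]
```

A caller would send each returned event over its existing connection; keeping the function free of network code makes the decision logic easy to test in isolation.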
OpenAI Advanced Voice Python Assistant is intended for personal use and experimentation. It is not production-ready and may not handle all edge cases correctly. Use at your own risk.
Additionally, using the OpenAI Realtime API can be costly. Be aware of rapid credit spending and monitor your usage to avoid unexpected charges.
This project is licensed under the MIT License.