openai-kokoro-tts

Welcome to openai-kokoro-tts! This is a third-party application that exposes an OpenAI API-compatible endpoint for generating high-quality text-to-speech (TTS) audio with Kokoro-TTS, a TTS engine developed by hexgrad. It can serve as a drop-in replacement for the OpenAI TTS endpoint in OpenAI-compatible client applications such as open-webui, so you can use Kokoro-TTS in your existing workflows without changing the client.

Prerequisites

Before getting started, ensure you have the following installed on your system:

- Git: For cloning the repository.
- Python 3.10 or newer: Required for development.
- Docker: Optional but recommended for production deployment.

Deployment Instructions

Using Docker Compose (Recommended)

git clone https://github.com/matthewhand/openai-kokoro-tts
cd openai-kokoro-tts
docker-compose up --build -d

The API will be available at http://localhost:9090.

Development Setup with `uv`

uv is a modern tool for managing Python environments, dependencies, and project workflows. Follow these steps to set up your development environment with uv.

Installation

macOS and Linux

Run the following command to install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh

Alternatively, use Homebrew:

brew install uv

Windows

Run this command in PowerShell to install uv:

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

Environment Configuration

Copy .env.example to .env:
```
cp .env.example .env
```
Update the .env File:
- Set a secure value for API_KEY to protect your API endpoints:
```
API_KEY=your_secure_api_key_here
```
- Adjust other settings (e.g., PORT, MODEL_PATH) based on your requirements.
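
One way to produce a suitable value for API_KEY is Python's standard `secrets` module. The snippet below is a minimal sketch rather than something the project prescribes; the helper name `generate_api_key` and the 32-byte length are assumptions chosen for illustration.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token suitable for use as an API key."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into your .env file.
    print(f"API_KEY={generate_api_key()}")
```

Any sufficiently long random string works equally well; the point is to avoid leaving the placeholder value in place.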

Setting Up Models

Ubuntu (Automated Setup)

Run the included setup_models.sh script:

bash setup_models.sh

This script will:

- Verify and install required tools such as Git LFS and espeak-ng.
- Clone the Kokoro-82M model repository into the models/kokoro directory.
- Create the voices directory (if missing).

macOS and Windows (Manual Setup)

If you're not on Ubuntu, follow these steps to set up the models manually:

Install Git LFS:
- macOS:
```
brew install git-lfs
git lfs install
```
- Windows: Download and install Git LFS from the Git LFS website, then run `git lfs install`.

Clone the Kokoro-82M Repository:

git clone https://huggingface.co/hexgrad/Kokoro-82M models/kokoro

Verify Directory Structure: Ensure the models/kokoro/kokoro-v0_19.pth file exists.
Install System Dependencies:
- macOS:
```
brew install espeak-ng
```
- Windows: Install espeak-ng.
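
The manual steps above can be double-checked with a short script. This is a sketch under stated assumptions: the function name `check_model_setup` is hypothetical, and the 1 MiB size threshold is an arbitrary heuristic for spotting a Git LFS pointer file (a small text stub left behind when the real weights were not downloaded).

```python
import shutil
from pathlib import Path


def check_model_setup(project_root: str = ".") -> list[str]:
    """Return a list of problems; an empty list means the setup looks complete."""
    problems = []
    model_file = Path(project_root) / "models" / "kokoro" / "kokoro-v0_19.pth"
    if not model_file.is_file():
        problems.append(f"missing model file: {model_file}")
    elif model_file.stat().st_size < 1024 * 1024:
        # Assumed heuristic: real weights are far larger than 1 MiB.
        problems.append(f"{model_file} looks like a Git LFS pointer; run 'git lfs pull'")
    if shutil.which("espeak-ng") is None:
        problems.append("espeak-ng not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_model_setup()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("Model setup looks complete.")
```

Run it from the project root after cloning the model repository.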

Running the Application

(Optional) Create a Virtual Environment: Navigate to the project directory and create a virtual environment with uv. The activation command shown is for macOS and Linux shells:
```
uv venv .venv
. .venv/bin/activate
```
Sync Dependencies: Install all Python dependencies specified in the pyproject.toml:
```
uv sync
```
Run the Flask Application:
```
uv run openai_kokoro_tts/server.py
```

The server will start, and the API will be available at http://localhost:8000.
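
Because the Docker deployment and the development server are documented on different ports (9090 and 8000), it can help to confirm which one is actually accepting connections. The following is a minimal sketch; the helper name `is_port_open` is an assumption, and it only checks that a TCP connection succeeds, not that the API itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Ports taken from this README: 8000 (development) and 9090 (Docker Compose).
    for port in (8000, 9090):
        state = "open" if is_port_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```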

ONNX and Transformers Usage

Default: ONNX for CPU Inference

By default, the service is configured to use ONNX for efficient CPU-based inference. No additional setup is required.

To run the service in CPU-only mode:

docker-compose up

Enabling Transformers with GPU Acceleration

To leverage GPU acceleration with transformers:

Rename the example override file:

mv docker-compose.override.yml.example docker-compose.override.yml

Start the service with GPU support:
```
docker-compose up
```
Note: Docker Compose automatically merges docker-compose.override.yml with docker-compose.yml when the override file is present in the same directory, so no extra flags are needed.

API Endpoints

`/v1/audio/speech`

Primary route for generating speech from text input. Requires an API key in the request header as a Bearer token.

URL: /v1/audio/speech
Method: POST
Headers: Authorization: Bearer <API_KEY>
Data (JSON):
- input (string): The input text to convert to speech.
- voice (string, optional): Voice model to use (default: "af_bella").
- response_format (string, optional): Output audio format (default: "mp3").
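
The request described above can be sketched with the Python standard library alone. The function names `build_speech_request` and `synthesize` are hypothetical, and the sketch assumes the response body is the raw audio bytes, as with OpenAI's own speech endpoint; verify this against your deployment.

```python
import json
import urllib.request


def build_speech_request(base_url, api_key, text, voice="af_bella", response_format="mp3"):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {"input": text, "voice": voice, "response_format": response_format}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(base_url, api_key, text, out_path="speech.mp3", **options):
    """Send the request and write the returned bytes to out_path."""
    request = build_speech_request(base_url, api_key, text, **options)
    # Assumption: the body is raw audio in the requested format.
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

For example, `synthesize("http://localhost:9090", "your_secure_api_key_here", "Hello from Kokoro")` would write `speech.mp3` to the current directory, provided the server is running on that port.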

`/v1/models`

Route for listing all available Kokoro-TTS voice models.

URL: /v1/models
Method: GET
Headers: Authorization: Bearer <API_KEY>
Response:
- A JSON object containing an array of available models.

Example Response:

{
  "models": [
    "af",
    "af_bella",
    "af_sarah",
    "am_adam",
    "am_michael",
    "bf_emma",
    "bf_isabella",
    "bm_george",
    "bm_lewis",
    "af_nicole",
    "af_sky"
  ]
}
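
A client might use this response to check that a voice exists before requesting speech. The sketch below assumes the response has the shape shown in the example above; the helper names `parse_voices`, `fetch_voices`, and `choose_voice` are hypothetical.

```python
import json
import urllib.request


def parse_voices(body):
    """Extract the voice names from a /v1/models response body (str or bytes)."""
    data = json.loads(body)
    return list(data.get("models", []))


def fetch_voices(base_url, api_key):
    """GET /v1/models and return the list of voice names."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_voices(response.read())


def choose_voice(voices, preferred, default="af_bella"):
    """Return the preferred voice if the server offers it, else the default."""
    return preferred if preferred in voices else default
```

Falling back to "af_bella" mirrors the documented default for the voice parameter.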

Responsible Use

The openai-kokoro-tts project is designed for lawful, ethical, and responsible use. Users are prohibited from deploying this tool for:

- Misleading or impersonating individuals.
- Generating disinformation or fraudulent content.
- Violating the privacy or rights of others.
- Harassing, bullying, or otherwise harming individuals or communities.

By using this project, you agree to comply with all applicable laws and OpenAI's usage policies.

Privacy Notice

This tool processes text inputs to generate speech and does not store or infer additional data from inputs. It is the user’s responsibility to ensure compliance with data privacy regulations when using this tool, especially if processing sensitive or personal data.

AI Disclosure

Outputs generated using openai-kokoro-tts are AI-generated. Users should not misrepresent these outputs as human-generated, especially in contexts where such misrepresentation could harm others or violate ethical guidelines.

Acknowledgments

This project utilizes the Kokoro-TTS engine developed by hexgrad. We appreciate their work and contributions to the TTS community.

TODO

ONNX CPU inference
Transformers GPU inference
Simplify using kokoro-onnx

License

This project is licensed under the MIT License.
