This Python web server provides a WebRTC interface that lets users talk to a Large Language Model (LLM) by voice. The server combines several libraries and models to handle WebRTC, voice activity detection, speech-to-text, natural language processing, and text-to-speech (a rough sketch of this pipeline follows the feature list).
- WebRTC Support: Uses `aiortc` for real-time audio streaming.
- Voice Activity Detection: Utilizes Silero VAD to detect when the user starts and stops speaking.
- Speech-to-Text: Integrates Whisper for high-quality transcription of spoken words.
- Natural Language Processing: Implements Llama 3.1 8B Instruct for understanding and generating responses.
- Text-to-Speech: Uses MeloTTS to convert responses back to speech.
- Optimized for Mac: Employs the mlx versions of the models for optimized performance on Mac systems.
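
The pipeline these features form can be sketched roughly as follows. This is a minimal illustration rather than the code in `server.py`: the model repositories, the `silero-vad` helper API, the chat formatting, and the `EN-US` speaker id are all assumptions, and the real server streams audio over WebRTC instead of processing a single buffer.

```python
# Rough sketch of the voice pipeline: VAD -> Whisper -> Llama -> MeloTTS.
# Everything here is illustrative; server.py may wire these pieces differently.
import numpy as np
import torch
import mlx_whisper
from mlx_lm import load, generate
from silero_vad import load_silero_vad, get_speech_timestamps
from melo.api import TTS

vad_model = load_silero_vad()
llm, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")  # assumed repo
tts = TTS(language="EN", device="cpu")


def respond_to_audio(pcm: np.ndarray, out_path: str = "reply.wav") -> str:
    """Turn a mono 16 kHz float32 speech buffer into a spoken reply."""
    # 1. Voice activity detection: skip buffers that contain no speech.
    if not get_speech_timestamps(torch.from_numpy(pcm), vad_model):
        return ""

    # 2. Speech-to-text with an mlx build of Whisper (repo name is an assumption).
    text = mlx_whisper.transcribe(
        pcm, path_or_hf_repo="mlx-community/whisper-large-v3-mlx"
    )["text"].strip()

    # 3. Generate a reply with Llama 3.1 8B Instruct via mlx_lm.
    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": text}],
        add_generation_prompt=True,
        tokenize=False,
    )
    reply = generate(llm, tokenizer, prompt=prompt, max_tokens=256)

    # 4. Text-to-speech with MeloTTS; the "EN-US" speaker id is an assumption.
    tts.tts_to_file(reply, tts.hps.data.spk2id["EN-US"], out_path)
    return reply
```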
- Python 3.8 or higher
- `pipenv` for dependency management
- Dependencies listed in the `Pipfile`
- Access to the mlx versions of the models
- Clone the repository:

  ```bash
  git clone https://github.com/paulingalls/versey-ai.git
  cd versey-ai
  ```
- Make sure `pipenv` is installed:

  ```bash
  pip install pipenv --user
  ```
- Install the required packages:

  ```bash
  pipenv install
  pipenv run python -m unidic download
  ```
- Start the server:

  ```bash
  pipenv run python ./server.py
  ```
- Wait until it says the server is ready (this can take a little time while it downloads the model files).
- Open a browser and navigate to `http://localhost:8080` to interact with the LLM via the WebRTC interface.
- Click the start button.
- Wait until it says `-open` in the data channel (this can take a while the first time, as it downloads the models); a server-side sketch of this handshake follows this list.
- Start talking.
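
For orientation, the WebRTC handshake behind these steps can be sketched with `aiortc` and `aiohttp` roughly as below. The `/offer` route, the exact data-channel message text, and the overall structure are assumptions made for illustration; the actual wiring lives in `server.py`.

```python
# Minimal sketch of a WebRTC offer endpoint and data-channel notification.
# The /offer path and the message text are assumptions, not server.py's API.
from aiohttp import web
from aiortc import RTCPeerConnection, RTCSessionDescription

pcs = set()  # keep peer connections alive for the lifetime of the app


async def offer(request: web.Request) -> web.Response:
    params = await request.json()
    pc = RTCPeerConnection()
    pcs.add(pc)

    @pc.on("datachannel")
    def on_datachannel(channel):
        # Signal readiness over the data channel (exact message text is an assumption).
        channel.send("open")

    @pc.on("track")
    def on_track(track):
        # Incoming microphone audio would be fed to the VAD/STT pipeline here.
        pass

    await pc.setRemoteDescription(
        RTCSessionDescription(sdp=params["sdp"], type=params["type"])
    )
    await pc.setLocalDescription(await pc.createAnswer())
    return web.json_response(
        {"sdp": pc.localDescription.sdp, "type": pc.localDescription.type}
    )


app = web.Application()
app.router.add_post("/offer", offer)

if __name__ == "__main__":
    web.run_app(app, port=8080)
```

In this sketch the browser page would create the SDP offer, POST it to `/offer`, and wait for the data-channel message before it starts streaming microphone audio.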
Contributions are welcome! Please submit a pull request or open an issue to discuss any changes or enhancements.
This project is licensed under the MIT License. See the `LICENSE` file for more details.