This app uses PyTorch to denoise recordings of human speech by separating the speech from background noise.
The model was trained on speech from Mozilla Common Voice and environmental sounds from UrbanSound8K.
- Clone the repository:
git clone https://github.com/v-perfilev/speech_denoiser.git
- Install the required packages:
pip install -r requirements.txt
- Copy a dataset with clean and noisy sound samples into the ../_datasets/ directory. To generate datasets, you can use another project of mine: https://github.com/v-perfilev/audio_dataset_handler.git.
- Train the model by running the model_training.ipynb notebook.
- Run the app:
python usage_example.py
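The dataset step above expects pairs of clean and noisy samples. One common way to build such a pair is to add a noise clip to a clean speech clip at a chosen signal-to-noise ratio. The sketch below illustrates that idea with NumPy only. The function name `mix_at_snr` and its parameters are hypothetical, and this is not necessarily how audio_dataset_handler generates its data.

```python
import numpy as np


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean signal so that the mix has the requested SNR in dB.

    Hypothetical helper for illustration; not part of this repository.
    """
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it has the same length as the clean clip.
    noise = np.resize(noise, clean.shape)

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    if noise_power == 0:
        # Silent noise clip: nothing to mix in.
        return clean.copy()

    # Solve clean_power / (gain**2 * noise_power) = 10 ** (snr_db / 10) for gain.
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise


# Synthetic stand-ins for a speech clip and a noise clip (1 s and 0.5 s at 16 kHz).
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
clean_clip = 0.5 * np.sin(2 * np.pi * 220 * t)
noise_clip = rng.normal(size=8000)

noisy_clip = mix_at_snr(clean_clip, noise_clip, snr_db=10)
```

In practice the two arrays would come from audio files at the same sample rate, for example loaded with soundfile, and the clean and noisy versions would be saved side by side in the dataset directory.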
- Real-time speech denoising using a pretrained neural network model.
- Supports multiple microphone inputs.
- Lightweight and easy to deploy.
- ffmpeg (!!!)
- numpy
- matplotlib
- torchaudio
- pyaudio
- soundfile
- torch
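Before running the app, it can help to confirm that everything in the list above is available. The following is a minimal sketch using only the standard library. It assumes that ffmpeg is needed as a system executable on the PATH (the list only flags it as important) and that each Python package is imported under the name shown in the list. The function name `missing_requirements` is hypothetical.

```python
import importlib.util
import shutil

# Import names taken from the requirements list above.
REQUIRED_MODULES = ["numpy", "matplotlib", "torchaudio", "pyaudio", "soundfile", "torch"]


def missing_requirements(modules=REQUIRED_MODULES, executables=("ffmpeg",)):
    """Return the names of Python modules and executables that cannot be found."""
    # find_spec returns None for a top-level module that is not installed.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # shutil.which returns None when the executable is not on the PATH.
    missing += [exe for exe in executables if shutil.which(exe) is None]
    return missing


if __name__ == "__main__":
    missing = missing_requirements()
    if missing:
        print("Missing requirements:", ", ".join(missing))
    else:
        print("All requirements found.")
```

This check does not import the packages, so it only confirms that they are installed, not that they load correctly.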