🎥 Python and OpenCV-based scene cut/transition detection program & library.

Python 3,410 408 Updated Dec 24, 2024

Implementation of StyleTTS for Mandarin

Python 11 Updated Jun 22, 2023

Foundational model for human-like, expressive TTS

Python 3,942 664 Updated Jul 30, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 36,314 4,461 Updated Aug 16, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 5,058 427 Updated Aug 10, 2024

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

Python 622 53 Updated Oct 1, 2024

UT-Sarulab MOS prediction system using SSL models

Python 197 14 Updated Apr 11, 2024

One minute of voice data can also be used to train a good TTS model! (few-shot voice cloning)

Python 37,286 4,239 Updated Dec 19, 2024

Fast and memory-efficient exact attention

Python 14,758 1,394 Updated Dec 22, 2024

The official repository of Dynamic-SUPERB.

Python 167 89 Updated Nov 11, 2024

wav2vec2 audio classification for prosodic boundary detection and other tasks

Jupyter Notebook 36 6 Updated Aug 11, 2023

A python package to build AI-powered real-time audio applications

Python 1,125 90 Updated Jul 8, 2024

ModelScope: bring the notion of Model-as-a-Service to life.

Python 7,146 741 Updated Dec 23, 2024

Self-Supervised Speech Pre-training and Representation Learning Toolkit

Python 2,297 486 Updated Dec 22, 2024

Pure Pytorch Docker Images.

Shell 443 73 Updated Oct 9, 2024

Repository for fine-tuning BEATs and using BEATs as feature extractor in a prototypical network. This repository has been used to complete the DCASE2023 challenge on few-shot bioacoustic events.

Python 33 3 Updated Dec 10, 2024

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 20,463 2,572 Updated Dec 15, 2024

Some comprehensive papers about speaker diarization

237 5 Updated Nov 12, 2024

This repository contains a set of codes to run (i.e., train, perform inference with, evaluate) a diarization method called EEND-vector-clustering.

Python 73 17 Updated Oct 18, 2022

Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit

Python 773 124 Updated Dec 24, 2024

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,370 111 Updated Dec 24, 2024

An awesome spoken LID repository. (Work in progress)

Python 97 10 Updated Apr 22, 2024

Pytorch implementation of "spectro-temporal attention-based voice activity detection"

Python 12 3 Updated Jun 4, 2024

The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Python 373 65 Updated Aug 16, 2024

Reading list for research topics in Sound AI

167 9 Updated Aug 8, 2024

Conditional Diffusion Probabilistic Model for Speech Enhancement

Python 220 34 Updated Dec 20, 2022

Deep Learning Morse Decoder

Python 17 4 Updated Dec 8, 2022

Testing Tensorflow LSTM for Morse decoder

Python 52 27 Updated Dec 8, 2022

Morse Code detection with eyes using Computer Vision

Python 55 24 Updated Feb 22, 2019