130 results for source starred repositories

Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deplo…

C 932 153 Updated Dec 24, 2024
Python 8 2 Updated Jul 15, 2024
Python 91 9 Updated Jul 6, 2024

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Python 607 80 Updated Dec 27, 2023

Code for paper "Noise-aware Speech Enhancement using Diffusion Probabilistic Model"

Python 84 2 Updated Jun 10, 2024
Python 27 2 Updated Jul 1, 2024

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,248 117 Updated Jul 11, 2024

Official repository for the "Powerset multi-class cross entropy loss for neural speaker diarization" paper published in Interspeech 2023.

Jupyter Notebook 75 4 Updated Oct 18, 2023

Paragraph-by-paragraph close readings of classic and recent deep learning papers

27,789 2,481 Updated Nov 17, 2024

Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read online at: https://datawhalechina.github.io/easy-rl/

Jupyter Notebook 9,901 1,908 Updated Nov 8, 2024

[TPAMI 2024 & CVPR 2023] PyTorch code for DGM4: Detecting and Grounding Multi-Modal Media Manipulation and beyond

Python 391 30 Updated Apr 23, 2024

Variational Bayes HMM over x-vectors diarization

Python 259 57 Updated Jan 15, 2024

AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

Python 10,087 868 Updated Jul 6, 2024

Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings

Shell 27 1 Updated Mar 16, 2023

Python toolkit for speech processing

Python 68 21 Updated Jan 9, 2025

A torch implementation of a recursion which turns out to be useful for RNN-T.

Python 140 22 Updated Aug 25, 2023

Robust Speech Recognition via Large-Scale Weak Supervision

Python 74,169 8,862 Updated Jan 4, 2025

A curated list of awesome Speech Enhancement papers, libraries, datasets, and other resources.

66 15 Updated Sep 9, 2019

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 12,976 902 Updated Oct 3, 2024

A PyTorch-based Speech Toolkit

Python 9,175 1,416 Updated Jan 11, 2025

This repo contains my attempt to create a Speaker Recognition and Verification system using SideKit-1.3.1

Python 110 33 Updated May 22, 2019

State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.

Python 3,560 311 Updated Jan 4, 2024

Source code for: Efficient Self-supervised Learning Representations for Spoken Language Identification

Python 4 Updated Sep 13, 2022

[NeurIPS 2023] Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective

Jupyter Notebook 122 11 Updated Oct 25, 2023
Python 5 2 Updated Nov 23, 2021

Final project for the Speaker Recognition course on Udemy, 机器之心, 深蓝学院 and 语音之家

Python 43 14 Updated May 7, 2024

LeetCode Solutions: A Record of My Problem Solving Journey.

JavaScript 54,924 9,466 Updated Dec 10, 2024

PHO-LID: A Unified Model to Incorporate Acoustic-Phonetic and Phonotactic Information for Language Identification

Python 21 2 Updated Aug 24, 2023