  • ZJU
  • Zhejiang Hangzhou, CHINA

Starred repositories

Paper, Code and Resources for Speech Language Model and End2End Speech Dialogue System.

150 12 Updated Nov 10, 2024
Python 183 10 Updated Feb 14, 2024

Speech, Language, Audio, Music Processing with Large Language Model

Python 645 58 Updated Jan 17, 2025

A toolkit for making real world machine learning and data analysis applications in C++

C++ 13,707 3,388 Updated Jan 19, 2025

A simple helper class to read RIFF WAVE files.

Python 1 Updated Jan 15, 2025

An end-to-end chorus detection model DeepChorus.

Python 34 8 Updated Mar 27, 2022
Python 38 1 Updated Aug 30, 2024

Transformer based on a variant of attention that has linear complexity with respect to sequence length

Python 725 68 Updated May 5, 2024

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 39,009 4,409 Updated Jan 18, 2025

Chinese speech pretrained models

Shell 1,062 90 Updated Aug 23, 2024

SpeechGPT Series: Speech Large Language Models

Python 1,326 89 Updated Jul 22, 2024

Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (Arxiv 2020) and "Predicting Personalized Head Movement From Short Video and Speech Signal" (TMM 2022)

Python 750 147 Updated Dec 15, 2023

Phoneme Recognition using pre-trained models Wav2vec2, HuBERT and WavLM. Throughout this project, we compared specifically three different self-supervised models, Wav2vec (2019, 2020), HuBERT (2021…

Python 216 18 Updated May 9, 2022

An implementation of the Contrastive Predictive Coding (CPC) method to train audio features in an unsupervised fashion.

Python 355 63 Updated Oct 12, 2021
Jupyter Notebook 115 11 Updated Jan 5, 2025

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 4,361 575 Updated Dec 26, 2024

Audio fingerprinting and recognition in Python

Python 6,480 1,438 Updated Apr 22, 2024

Olaf: Overly Lightweight Acoustic Fingerprinting is a portable acoustic fingerprinting system.

C 328 33 Updated May 16, 2024

A repository for my MSc thesis in Data Science & Machine Learning @ NTUA. A deep learning approach to audio fingerprinting for recognizing songs in real time through the microphone.

Jupyter Notebook 25 2 Updated Nov 12, 2024

Capture Screen, Audio, Cursor, Mouse Clicks and Keystrokes

C# 9,984 1,877 Updated Apr 9, 2023

eyeD3 is a Python module and command line program for processing ID3 tags. Information about mp3 files (i.e. bit rate, sample frequency, play time, etc.) is also provided. The formats supported are …

Python 558 59 Updated Sep 4, 2024

A deep learning project for automated chorus detection in songs, featuring a command-line interface (CLI) tool that allows users to input a YouTube link and utilize a pre-trained CRNN model to dete…

Jupyter Notebook 16 4 Updated Oct 27, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 38,198 4,990 Updated Jan 17, 2025

Machine Learning Journal for Intermediate to Advanced Topics.

Jupyter Notebook 1,456 136 Updated Dec 14, 2024

FFmpeg for windows with x264

C 4 1 Updated Jun 9, 2020

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,523 126 Updated Jan 17, 2025

Real-time interactive GPT digital human

Python 361 76 Updated Dec 26, 2024

THIS REPOSITORY IS JUST MIRROR! Main development repository is https://codeberg.org/Freedium-cfd/web

Python 846 68 Updated Jan 5, 2025