windowso
  • Tsinghua University, Department of Electronic Engineering
  • Beijing, China

a high-performance, POSIX-ish Amazon S3 file system written in Go

Go 5,298 527 Updated Jul 18, 2024

Evaluation Protocol for Large-Scale Zero-Shot TTS Literature

Python 75 9 Updated Mar 12, 2025

GitHub Pages template based on HTML and Markdown for personal, portfolio-based websites.

JavaScript 13,528 46,107 Updated Mar 2, 2025

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 11,873 1,181 Updated Mar 13, 2025

The official implementation of the paper "SWIM: Short-Window CNN Integrated with Mamba for EEG-Based Auditory Spatial Attention Decoding"

Jupyter Notebook 15 3 Updated Oct 16, 2024

Automatically update text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)

Python 385 23 Updated Mar 13, 2025

Learn fast, scalable, and calibrated measures of uncertainty using neural networks!

Python 461 95 Updated Aug 31, 2021

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,333 75 Updated Sep 27, 2024

Repo for counting stars and contributing. Press F to pay respect to glorious developers.

270,420 21,110 Updated Oct 3, 2024

Implementation of Denoising Diffusion Probabilistic Model in PyTorch

Python 9,011 1,101 Updated Oct 9, 2024

Generative models for conditional audio generation

Python 2,952 294 Updated Feb 28, 2025

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,946 625 Updated May 31, 2024

Mamba SSM architecture

Python 14,219 1,238 Updated Jan 18, 2025

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,675 675 Updated Mar 3, 2025

C 215 54 Updated Nov 27, 2023

Evaluate your speech-to-text system with similarity measures such as word error rate (WER)

Python 697 101 Updated Feb 15, 2025

Reference-aware automatic speech evaluation toolkit

Python 144 12 Updated Dec 5, 2024

A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR

Python 951 160 Updated Jul 5, 2023

Zero-Shot Speech Editing and Text-to-Speech in the Wild

Jupyter Notebook 8,177 782 Updated Jun 24, 2024

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

1,860 236 Updated Jun 6, 2024

Collection of resources on the applications of Large Language Models (LLMs) in Audio AI.

657 38 Updated Aug 3, 2024

Daily tracking of awesome audio papers, including music generation, zero-shot TTS, ASR, and audio generation

374 15 Updated Mar 10, 2025

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html

Python 2,105 321 Updated Nov 14, 2023

An unofficial PyTorch implementation of VALL-E

Python 88 7 Updated Mar 13, 2025

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,988 417 Updated May 10, 2023

Hackable and optimized Transformers building blocks, supporting a composable construction.

Python 9,170 652 Updated Mar 9, 2025

Fast and memory-efficient exact attention

Python 16,280 1,542 Updated Mar 13, 2025

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,641 2,272 Updated Mar 13, 2025

g2p: English Grapheme To Phoneme Conversion

Python 841 129 Updated Jan 5, 2023

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 41,353 6,239 Updated Mar 13, 2025