AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…

477 33 Updated Sep 6, 2024

[NeurIPS 2024] SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words

Python 38 1 Updated Jun 25, 2024

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.

Python 353 30 Updated Jan 25, 2024

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 4,499 387 Updated Sep 23, 2024

[being rewritten] Cross-platform iMessage POC

Python 3,560 396 Updated Jun 3, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 19,638 2,161 Updated Aug 12, 2024

AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. It aims to benchmark the robustness of ASV models in the face…

HTML 11 Updated Nov 21, 2023

Community interface for generative AI

TypeScript 8,713 864 Updated Apr 30, 2024

An open-source tool-augmented conversational language model from Fudan University

Python 11,926 1,145 Updated Jul 13, 2024

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

C++ 69,790 7,634 Updated Oct 8, 2024

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,405 221 Updated Jun 2, 2024

LibriVoc is a new open-source, large-scale dataset for vocoder artifact detection. LibriVoc is derived from the LibriTTS speech corpus, which is widely used in text-to-speech research. The LibriTT…

Rich Text Format 16 1 Updated Jan 24, 2023

Official PyTorch implementation of BigVGAN (ICLR 2023)

Python 856 97 Updated Sep 5, 2024

Think DSP: Digital Signal Processing in Python, by Allen B. Downey.

Jupyter Notebook 3,942 3,205 Updated May 10, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 36,593 5,766 Updated Aug 19, 2024

Human ChatGPT Comparison Corpus (HC3), Detectors, and more! 🔥

Python 1,251 119 Updated Dec 1, 2023
Python 36 1 Updated Mar 26, 2024

Muzic: Music Understanding and Generation with Artificial Intelligence

Python 4,492 439 Updated Oct 8, 2024

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 34,437 4,166 Updated Aug 16, 2024

Deep neural networks for voice conversion (voice style transfer) in Tensorflow

Python 3,918 843 Updated Sep 30, 2022

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Python 11,672 3,451 Updated Oct 8, 2024

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Python 52,336 8,752 Updated Aug 14, 2024

Fatcord's Alternative WaveRNN (Faster training)

Python 126 72 Updated Mar 29, 2019

WaveRNN Vocoder + TTS

Python 2,131 697 Updated Jul 2, 2022

The Economist (经济学人), continuously updated

3,592 544 Updated Jun 23, 2023

Tacotron 2 - PyTorch implementation with faster-than-realtime inference

Jupyter Notebook 5,058 1,378 Updated Jun 12, 2024

OpenFace – a state-of-the-art tool intended for facial landmark detection, head pose estimation, facial action unit recognition, and eye-gaze estimation.

MATLAB 6,881 1,842 Updated Jun 1, 2024

A TensorFlow Implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model

Python 1,828 436 Updated Jan 17, 2022