Jiaxin-Ye
🏪 Working in the laboratory


LayoutDM: Discrete Diffusion Model for Controllable Layout Generation [Inoue+, CVPR2023]

Python 215 23 Updated Oct 24, 2023

🙌 OpenHands: Code Less, Make More

Python 32,784 3,754 Updated Oct 8, 2024

A Framework of Small-scale Large Multimodal Models

Python 599 53 Updated Sep 10, 2024

FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis and music generation.

Python 353 30 Updated Jan 25, 2024

Next-Token Prediction is All You Need

Python 866 25 Updated Oct 8, 2024
Python 484 30 Updated Feb 13, 2024

AI powered speech denoising and enhancement

Python 1,330 135 Updated Jun 21, 2024

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 6,204 659 Updated Sep 30, 2024

Noise suppression using deep filtering

Python 2,404 223 Updated Jul 31, 2024

EEG-Audio-Video Dataset for Emotion Recognition in Conversations

Python 8 2 Updated Oct 7, 2024
Python 6 Updated Oct 10, 2023
Python 248 23 Updated Mar 15, 2024

This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on

Python 437 39 Updated Jun 9, 2024

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,944 417 Updated May 10, 2023
Python 48 6 Updated May 17, 2023

[CVPR 2024🔥] EditGuard: Versatile Image Watermarking for Tamper Localization and Copyright Protection

Python 144 9 Updated Oct 7, 2024

Bilibili video and danmaku (bullet comment) downloader

Python 11 3 Updated Jun 23, 2024

Commonly used Bilibili API calls. Supports videos, anime series (bangumi), users, channels, audio, and more. Original repository: https://github.com/MoyuScript/bilibili-api

Python 2,138 203 Updated Oct 7, 2024

🔎 Search for YouTube videos, channels & playlists. Get 🎞 video & 📑 playlist info using link. Get search suggestions. WITHOUT YouTube Data API v3.

Python 734 161 Updated Jun 30, 2022

🎙️ TED Talks web scraper

Python 25 7 Updated May 21, 2024

A relatively powerful web mind map.

JavaScript 6,189 870 Updated Sep 30, 2024

Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis

186 12 Updated Sep 25, 2024

Official source code for the paper: "Reading Between the Frames Multi-Modal Non-Verbal Depression Detection in Videos"

Python 40 6 Updated May 16, 2024

GitHub repository of the FaceForensics dataset

Python 2,355 533 Updated Dec 8, 2022

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching

Jupyter Notebook 653 81 Updated Oct 8, 2024

An emotion-controllable speech synthesis model that requires no emotion labels, based on VITS

Jupyter Notebook 1,316 167 Updated Mar 30, 2023

This is the GitHub page for publicly available emotional speech data.

316 22 Updated Jan 6, 2022

Official repository of the xLSTM.

Python 1,279 92 Updated Sep 7, 2024

Automatic Depression Detection: a GRU/BiLSTM-based Model and an Emotional Audio-Textual Corpus

Python 128 31 Updated Jul 10, 2023