Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,232 1,070 Updated Mar 25, 2025

Open source real-time translation app for Android that runs locally

C++ 7,684 624 Updated Mar 23, 2025

Easily train a good VC model with voice data <= 10 mins!

Python 28,732 4,052 Updated Nov 24, 2024

Face Editor for Stable Diffusion

Python 1,055 89 Updated Sep 15, 2024

ClovaCall dataset and PyTorch LAS baseline code (Interspeech 2020)

Python 219 56 Updated Apr 5, 2022

A public domain single speaker Japanese speech dataset

Python 52 5 Updated Nov 5, 2023

[CVPR 2024] Official repository for "Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians"

Python 819 60 Updated Jul 12, 2024

Open-Sora: Democratizing Efficient Video Production for All

Python 26,165 2,514 Updated Mar 27, 2025

This project aims to reproduce Sora (OpenAI's T2V model); we hope the open source community will contribute to this project.

Python 11,940 1,052 Updated Apr 2, 2025

SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech

11 Updated Jun 30, 2023

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Python 9,815 986 Updated Apr 14, 2025

Pushing the Limits of Zero-shot End-to-End Speech Translation

Python 25 3 Updated Dec 12, 2024

Deep learning for audio processing

Jupyter Notebook 640 111 Updated Dec 27, 2024

Source code of "Textual Alchemy: CoFormer for Scene Text Understanding", published in WACV 2024

Python 1 1 Updated Dec 27, 2023

[CVPR 2024] PIA, your Personalized Image Animator. Animate your images by text prompt, combining with Dreambooth, achieving stunning videos. (PIA, your personalized image animator, uses text prompts to turn images into wonderful animations.)

Python 958 75 Updated Aug 5, 2024

[SIGGRAPH Asia 2023] Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation

Jupyter Notebook 2,978 200 Updated Mar 9, 2024

🔥 [CVPR 2020] STEFANN: Scene Text Editor using Font Adaptive Neural Network (official code).

Python 270 41 Updated Apr 30, 2024

Let us democratise high-resolution generation! (CVPR 2024)

Jupyter Notebook 2,011 226 Updated Apr 15, 2024

Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"

Jupyter Notebook 997 46 Updated Aug 12, 2024
Python 92 7 Updated Aug 1, 2024
Python 56 4 Updated Jul 25, 2023

The PyTorch implementation of our CVPR 2023 paper "Conditional Image-to-Video Generation with Latent Flow Diffusion Models"

Python 460 42 Updated Jun 18, 2024
639 10 Updated Apr 6, 2023

One-click Face Swapper and Restoration powered by insightface 🔥

Python 605 99 Updated Apr 16, 2024

Industry leading face manipulation platform

Python 22,436 3,417 Updated Apr 16, 2025

WavJourney: Compositional Audio Creation with LLMs

Python 535 43 Updated Sep 28, 2023

🎼 text-to-video system for music visualization

Python 1 Updated Jun 25, 2023

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 1 Updated Jun 18, 2023

faster-whisper livestream translation, OBS noise reduction, dual language subtitles

Python 78 7 Updated Apr 26, 2023