misumisumi

MiSumiSumi misumisumi

Graduate student of audio processing

16 followers · 16 following

Lists (31)

Stars

129 stars written in Python

AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

Python 144,995 27,227 Updated Dec 28, 2024

huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

Python 136,859 27,404 Updated Dec 27, 2024

openai / whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Python 73,441 8,766 Updated Dec 1, 2024

zylon-ai / private-gpt

Interact with your documents using the power of GPT, 100% privately, no data leaks

Python 54,548 7,329 Updated Nov 13, 2024

run-llama / llama_index

LlamaIndex is a data framework for your LLM applications

Python 37,632 5,407 Updated Dec 27, 2024

LAION-AI / Open-Assistant

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

Python 37,131 3,249 Updated Aug 17, 2024

facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,735 6,433 Updated Oct 18, 2024

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 27,667 3,158 Updated Aug 12, 2024

deezer / spleeter

Deezer source separation library including pretrained models.

Python 26,106 2,866 Updated Oct 29, 2024

RVC-Project / Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!

Python 25,534 3,717 Updated Nov 24, 2024

junyanz / pytorch-CycleGAN-and-pix2pix

Image-to-Image Translation in PyTorch

Python 23,329 6,345 Updated May 14, 2024

facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Python 21,227 2,186 Updated Nov 11, 2024

fishaudio / fish-speech

SOTA Open Source TTS

Python 17,797 1,330 Updated Dec 25, 2024

khoj-ai / khoj

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous …

Python 17,380 850 Updated Dec 27, 2024

w-okada / voice-changer

Realtime Voice Changer

Python 16,804 1,827 Updated Nov 14, 2024

OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,947 906 Updated Oct 22, 2024

BlinkDL / RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…

Python 12,864 875 Updated Dec 26, 2024

NVIDIA / NeMo

A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Python 12,552 2,576 Updated Dec 28, 2024

flet-dev / flet

Flet enables developers to easily build realtime web, mobile and desktop apps in Python. No frontend experience required.

Python 11,923 467 Updated Dec 19, 2024

milesial / Pytorch-UNet

PyTorch implementation of the U-Net for image semantic segmentation with high quality images

Python 9,496 2,532 Updated Aug 11, 2024

espnet / espnet

End-to-End Speech Processing Toolkit

Python 8,626 2,200 Updated Dec 28, 2024

facebookresearch / demucs

Code for the paper Hybrid Spectrogram and Waveform Source Separation

Python 8,514 1,089 Updated Apr 24, 2024

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 8,461 1,089 Updated Dec 27, 2024

fishaudio / Bert-VITS2

vits2 backbone with multilingual-bert

Python 8,127 1,151 Updated Dec 23, 2024

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 7,969 600 Updated Dec 27, 2024

Plachtaa / VALL-E-X

An open source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io/vallex/

Python 7,743 769 Updated Feb 11, 2024

netease-youdao / EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

Python 7,556 641 Updated Aug 13, 2024

lllyasviel / IC-Light

More relighting!

Python 7,155 415 Updated Nov 28, 2024

sissbruecker / linkding

Self-hosted bookmark manager that is designed to be minimal, fast, and easy to set up using Docker.

Python 7,078 330 Updated Dec 18, 2024

kyutai-labs / moshi

Python 7,056 550 Updated Dec 20, 2024

Lists (31)

👍 LifeTips

ASR

📖 LLM

📷 Vision

🐈 VRChat

☕ coffee break

Develop Envs

English

fediverse

github-workflow

GUI

k8s

latex

macos

🔍 ML

📝 Editor

MusicSourceSeparation

neovim

obsidian

📦 nix

📄 Dataset

📎 Paper and Docs

🐕‍🦺 Server

Shell

🔉 DSP

🔉 TTS

🔉 VC

🔉 Vocoder

speech-processing

🎮 Emulator

vm
