Rust 1 Updated Sep 29, 2024

Code for paper "Network Bending of Diffusion Models for Audio-Visual Generation" at DAFx 2024

Python 11 Updated Jun 12, 2024

Reference-aware automatic speech evaluation toolkit

Python 97 7 Updated Feb 22, 2024

Easy-to-Use Speech MOS predictors

Python 215 16 Updated Oct 24, 2023

A Representation Evaluation Framework for Music Information Retrieval tasks

Python 38 Updated Apr 9, 2024

Solve puzzles. Improve your pytorch.

Jupyter Notebook 3,189 264 Updated Jul 15, 2024
HTML 38 Updated Jun 11, 2024

Inference codebase for "Cacophony: An Improved Contrastive Audio-Text Model". Preprint: https://arxiv.org/abs/2402.06986

Python 29 3 Updated Apr 26, 2024

Source code for "Modulation Extraction for LFO-driven Audio Effects".

Python 30 4 Updated Jul 5, 2024

Seamless analysis of your PyTorch models (RAM usage, FLOPs, MACs, receptive field, etc.)

Python 207 22 Updated Oct 7, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 37,829 3,976 Updated Jul 28, 2024

Autocorrelation-based O(N log N) pitch detection

C++ 571 67 Updated Dec 27, 2023

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Python 183 15 Updated Apr 24, 2024

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

TypeScript 2,688 319 Updated Aug 21, 2024

Repository for training models for music source separation.

Python 409 54 Updated Sep 27, 2024

Implementation of Band Split Roformer, SOTA Attention network for music source separation out of ByteDance AI Labs

Python 406 14 Updated Aug 6, 2024

Collection of audio-focused loss functions in PyTorch

Python 725 66 Updated Jul 30, 2024

Official codes and models of the paper "Auffusion: Leveraging the Power of Diffusion and Large Language Models for Text-to-Audio Generation"

Jupyter Notebook 146 12 Updated Mar 25, 2024

Codebase and project page for EDMSound

Python 29 1 Updated Nov 20, 2023

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

Python 1,146 106 Updated Jul 11, 2024

Self-supervised learning for fast pitch estimation

Python 175 15 Updated Oct 2, 2024

An example plugin using RTNeural with a SIMD architecture determined at run-time

CMake 7 Updated Nov 26, 2023

2-2000x faster ML algos, 50% less memory usage, works on all hardware - new and old.

Jupyter Notebook 1,792 127 Updated Jun 27, 2024

Generate new latent codes for RAVE with Denoising Diffusion models.

Python 159 9 Updated Dec 9, 2023
Python 21 1 Updated Oct 6, 2024

[CVPR 2024 Highlight] PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics

Python 1,009 40 Updated Jul 22, 2024

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models

Python 4,811 395 Updated Aug 10, 2024

Real-time Neural Timbre Transfer

C++ 389 12 Updated Sep 16, 2024

Differentiable audio signal processors in PyTorch

Python 225 5 Updated Dec 4, 2023