jfsantos

Organizations

@JuliaDSP @MuSAELab


TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching

Jupyter Notebook 539 46 Updated Jan 12, 2025

Event Relation in Text-to-Audio (TTA) Generation

Python 16 Updated Jan 2, 2025

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 6,715 408 Updated Jan 9, 2025

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.

Python 9,533 925 Updated Jan 14, 2025

Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)

Python 69 1 Updated Dec 3, 2024
Python 91 2 Updated Dec 17, 2024

A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.

Python 1,748 71 Updated Jan 2, 2025

InspireMusic: A Unified Framework for Music, Song, Audio Generation.

Python 296 26 Updated Dec 27, 2024

This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".

108 3 Updated Jan 14, 2025

Versatile Evaluation of Speech and Audio

Python 145 11 Updated Dec 31, 2024

A toolkit for processing speech data and creating speech datasets

Python 103 22 Updated Jan 10, 2025

A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

Python 546 26 Updated Mar 10, 2023

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 1,233 66 Updated Sep 27, 2024

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Rust 31,366 2,039 Updated Jan 14, 2025

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,390 227 Updated Jan 14, 2025

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 8,907 1,173 Updated Jan 14, 2025

Explainability for Vision Transformers

Python 886 103 Updated Mar 12, 2022

This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Ama…

HTML 212 58 Updated May 23, 2024

Minimalist ML framework for Rust

Rust 16,290 1,004 Updated Jan 13, 2025

Official repository of SepReformer for speech separation

Python 164 15 Updated Jan 13, 2025
Python 49 3 Updated Jun 28, 2023

Code repository for the paper - "Matryoshka Representation Learning"

Jupyter Notebook 443 22 Updated Feb 19, 2024
Python 490 32 Updated Jul 29, 2024
Python 7,144 558 Updated Jan 14, 2025

Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 11,732 705 Updated Jan 11, 2025

multi-task and multi-track music transcription for everyone

124 3 Updated Nov 29, 2024

Use PWM and simple low-pass filters on the output to create two simultaneous waveforms from an Arduino

C 23 5 Updated Apr 3, 2023

Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation

Python 554 77 Updated Dec 30, 2024

Python library for extracting chords from multiple sound file formats

Python 142 24 Updated Nov 4, 2022

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 183 13 Updated Apr 20, 2024