- Vancouver, BC, Canada
- http://www.seaandsailor.com
- @seaandsailor
Stars
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Official PyTorch implementation of "Paralinguistics-Aware Speech-Empowered LLMs for Natural Conversation" (NeurIPS 2024)
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
InspireMusic: A Unified Framework for Music, Song, Audio Generation.
This is an evolving repo for the paper "Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey".
A toolkit for processing speech data and creating speech datasets
A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
This is an open-source implementation of the ITU P.808 standard for "Subjective evaluation of speech quality with a crowdsourcing approach" (see https://www.itu.int/rec/T-REC-P.808/en). It uses Ama…
Official repository of SepReformer for speech separation
Code repository for the paper - "Matryoshka Representation Learning"
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Multi-task and multi-track music transcription for everyone
Use PWM and simple low-pass filters on the output to create two simultaneous waveforms from an Arduino
Score-based Generative Models (Diffusion Models) for Speech Enhancement and Dereverberation
Python library for extracting chords from multiple sound file formats
FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3