YgorCastor/nx_audio

NxAudio

NxAudio is an Elixir library for working with audio tensors, providing functionality similar to Python's torchaudio but built for the Nx ecosystem.

Features

  • Audio I/O operations with support for multiple formats
  • Audio transformations and processing
    • Amplitude to dB
    • Mel spectrogram and STFT
  • Spectrogram visualizations
  • Multiple codec support including:
    • PCM formats (S16, S24, S32, S8, U8, F32, F64)
    • FLAC
    • MP3
    • Vorbis
    • Opus
    • AMR (NB/WB)
    • μ-law and A-law
    • HTK
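
To make the amplitude-to-dB transform listed above more concrete, here is a minimal sketch of the underlying math in plain Nx. It does not use NxAudio's own API: the `DbSketch` module, the `power_to_db/2` function, and the `:ref` and `:amin` options are hypothetical names chosen for illustration. It assumes a power-scaled input, so the multiplier is 10; for magnitude (amplitude) input the multiplier would be 20.

```elixir
# Standalone script; fetches Nx only. Not NxAudio's implementation.
Mix.install([{:nx, "~> 0.9"}])

defmodule DbSketch do
  # Hypothetical helper: 10 * log10(max(power, amin) / ref)
  def power_to_db(power, opts \\ []) do
    ref = Keyword.get(opts, :ref, 1.0)
    amin = Keyword.get(opts, :amin, 1.0e-10)

    power
    |> Nx.max(amin)               # clamp so log(0) never occurs
    |> Nx.divide(ref)             # express power relative to the reference
    |> Nx.log()                   # natural logarithm
    |> Nx.divide(:math.log(10))   # change of base to log10
    |> Nx.multiply(10.0)          # power ratio to decibels
  end
end

power = Nx.tensor([1.0, 10.0, 100.0])
db = DbSketch.power_to_db(power)
IO.inspect(db)
```

Each tenfold increase in power adds roughly 10 dB, which is why the three inputs land about 10 dB apart.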

Installation

Add nx_audio to your list of dependencies in mix.exs:

def deps do
  [
    {:nx_audio, "~> 0.1.0"}
  ]
end

Dependencies

NxAudio requires:

  • Elixir ~> 1.17
  • FFmpeg for audio processing capabilities
  • Nx for tensor operations
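
Since FFmpeg is an external requirement, it can help to confirm that it is available before calling into the library. The snippet below is a small sketch using only Elixir's standard library; it assumes the binary is named `ffmpeg` and is expected on the `PATH`, which may not match every installation.

```elixir
# System.find_executable/1 returns the absolute path, or nil when not found.
ffmpeg_path = System.find_executable("ffmpeg")

case ffmpeg_path do
  nil -> IO.puts("ffmpeg not found on PATH; install it before using NxAudio")
  path -> IO.puts("ffmpeg found at #{path}")
end
```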

Usage Examples

Basic audio operations:

# Reading an audio file
{:ok, {tensor, sample_rate}} = NxAudio.IO.load("path/to/audio.mp3")

# Generating spectrograms
spectrogram = NxAudio.Transforms.Spectrogram.transform(tensor, sample_rate: sample_rate)
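
Once a tensor and its sample rate are available, a common first step is to check how long the clip is. The following sketch uses plain Nx and a synthetic tensor in place of a loaded file. `AudioInfo.duration_seconds/2` is a hypothetical helper, not part of NxAudio, and it assumes samples run along the last axis (for example `{channels, samples}`); verify the layout that `NxAudio.IO.load` actually returns before relying on it.

```elixir
# Standalone script; fetches Nx only.
Mix.install([{:nx, "~> 0.9"}])

defmodule AudioInfo do
  # Hypothetical helper. Assumes the last axis holds the samples.
  def duration_seconds(tensor, sample_rate) do
    shape = Nx.shape(tensor)
    samples = elem(shape, tuple_size(shape) - 1)
    samples / sample_rate
  end
end

# Stand-in for a loaded file: 2 channels of silence, 16_000 samples each.
tensor = Nx.broadcast(0.0, {2, 16_000})
IO.inspect(AudioInfo.duration_seconds(tensor, 16_000))
```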

Documentation

Detailed documentation is organized into the following sections:

  • IO - Audio file reading/writing operations
  • Transformations - Audio signal processing functions
  • Visualizations - Spectrogram and waveform visualization tools
  • Codecs - Supported audio format encodings

For more examples and detailed API documentation, visit the official documentation.

License

This project is licensed under the MIT License.