Port of Suno AI's Bark in C/C++ for fast inference

ochafik/bark.cpp

This branch is 17 commits behind PABannier/bark.cpp:main.

bark.cpp

License: MIT

Roadmap / encodec.cpp / ggml

Inference of SunoAI's bark model in pure C/C++.

Description

With bark.cpp, our goal is to bring real-time realistic multilingual text-to-speech generation to the community.

  • Plain C/C++ implementation without dependencies
  • AVX, AVX2 and AVX512 for x86 architectures
  • CPU and GPU compatible backends
  • Mixed F16 / F32 precision
  • 4-bit, 5-bit and 8-bit integer quantization
  • Metal and CUDA backends

Models supported

Models we want to implement! Please open a PR :)

Demo on Google Colab (#95)


Here is a typical run using bark.cpp:

make -j && ./main -p "This is an audio generated by bark.cpp"

   __               __
   / /_  ____ ______/ /__        _________  ____
  / __ \/ __ `/ ___/ //_/       / ___/ __ \/ __ \
 / /_/ / /_/ / /  / ,<    _    / /__/ /_/ / /_/ /
/_.___/\__,_/_/  /_/|_|  (_)   \___/ .___/ .___/
                                  /_/   /_/

bark_tokenize_input: prompt: 'This is an audio generated by bark.cpp'
bark_tokenize_input: number of tokens in prompt = 513, first 8 tokens: 20795 20172 20199 33733 58966 20203 28169 20222

Generating semantic tokens: [========>                                          ] (17%)

bark_print_statistics:   sample time =    10.98 ms / 138 tokens
bark_print_statistics:  predict time =   614.96 ms / 4.46 ms per token
bark_print_statistics:    total time =   633.54 ms

Generating coarse tokens: [==================================================>] (100%)

bark_print_statistics:   sample time =     3.75 ms / 410 tokens
bark_print_statistics:  predict time =  3263.17 ms / 7.96 ms per token
bark_print_statistics:    total time =  3274.00 ms

Generating fine tokens: [==================================================>] (100%)

bark_print_statistics:   sample time =    38.82 ms / 6144 tokens
bark_print_statistics:  predict time =  4729.86 ms / 0.77 ms per token
bark_print_statistics:    total time =  4772.92 ms

write_wav_on_disk: Number of frames written = 65600.

main:     load time =   324.14 ms
main:     eval time =  8806.57 ms
main:    total time =  9131.68 ms
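
The figures in this log can be cross-checked with a little arithmetic. The sketch below recomputes the per-token predict times and estimates how the total run time compares with the length of the generated audio. It assumes a 24 kHz output sample rate, which the log does not state; the function names are illustrative and not part of bark.cpp.

```python
# Figures copied from the run log above.
FRAMES_WRITTEN = 65600      # write_wav_on_disk: frames written
TOTAL_TIME_MS = 9131.68     # main: total time
STAGES = {                  # stage: (predict time in ms, number of tokens)
    "semantic": (614.96, 138),
    "coarse": (3263.17, 410),
    "fine": (4729.86, 6144),
}

# Assumption: the generated audio is sampled at 24 kHz.
# The log does not print the sample rate.
SAMPLE_RATE_HZ = 24_000


def ms_per_token(predict_ms, tokens):
    """Average predict time per token, in milliseconds."""
    return predict_ms / tokens


def audio_seconds(frames, sample_rate=SAMPLE_RATE_HZ):
    """Duration of the written audio, in seconds."""
    return frames / sample_rate


def real_time_factor(total_ms, frames, sample_rate=SAMPLE_RATE_HZ):
    """Wall-clock seconds spent per second of generated audio."""
    return (total_ms / 1000.0) / audio_seconds(frames, sample_rate)


if __name__ == "__main__":
    for name, (predict_ms, tokens) in STAGES.items():
        print(f"{name:>8}: {ms_per_token(predict_ms, tokens):.2f} ms per token")
    print(f"audio length: {audio_seconds(FRAMES_WRITTEN):.2f} s")
    print(f"real-time factor: {real_time_factor(TOTAL_TIME_MS, FRAMES_WRITTEN):.2f}")
```

Under the 24 kHz assumption, this run produced roughly 2.7 s of audio in about 9.1 s, so generation was a little over three times slower than real time on the machine that produced the log.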

Here are typical audio pieces generated by bark.cpp:

audio1.mp4
audio2.mp4

Usage

Here are the steps to use bark.cpp:

Get the code

git clone --recursive https://github.com/PABannier/bark.cpp.git
cd bark.cpp
git submodule update --init --recursive

Build

To build bark.cpp, use CMake:

mkdir build
cd build
cmake ..
cmake --build . --config Release

Prepare data & Run

# Install Python dependencies
python3 -m pip install -r requirements.txt

# Download the Bark checkpoints and vocabulary
python3 download_weights.py --out-dir ./models --models bark-small bark

# Convert the model to ggml format
python3 convert.py --dir-model ./models/bark-small --use-f16

# Run inference
./build/examples/main/main -m ./models/bark-small/ggml_weights.bin -p "this is an audio generated by bark.cpp" -t 4
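
The steps above can also be chained from a single script. The following is a minimal sketch that reuses the exact commands and flags listed above; the helper names `pipeline_commands` and `run_all` are hypothetical, and it assumes the script is run from the repository root after the build step.

```python
import subprocess


def pipeline_commands(model_dir="./models/bark-small",
                      prompt="this is an audio generated by bark.cpp",
                      threads=4):
    """Return the prepare-and-run commands from this section, in order."""
    weights = f"{model_dir}/ggml_weights.bin"
    return [
        # Install Python dependencies
        ["python3", "-m", "pip", "install", "-r", "requirements.txt"],
        # Download the Bark checkpoints and vocabulary
        ["python3", "download_weights.py", "--out-dir", "./models",
         "--models", "bark-small", "bark"],
        # Convert the model to ggml format
        ["python3", "convert.py", "--dir-model", model_dir, "--use-f16"],
        # Run inference
        ["./build/examples/main/main", "-m", weights, "-p", prompt,
         "-t", str(threads)],
    ]


def run_all(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling `run_all(pipeline_commands())` executes the four steps in sequence. Passing `check=True` stops the script if, for example, the download fails, so inference is never run against missing weights.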

(Optional) Quantize weights

Weights can be quantized using one of the following strategies: q4_0, q4_1, q5_0, q5_1, q8_0.

To preserve audio quality, the codec model is not quantized. Only the GPT models are quantized, since their forward pass accounts for the bulk of the computation.

./build/examples/quantize/quantize ./ggml_weights.bin ./ggml_weights_q4.bin q4_0
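
To get a rough feel for how much each strategy shrinks the quantized tensors, the sketch below computes the effective bits per weight. It assumes the block layouts used by recent ggml versions (32 weights per block, plus per-block fp16 scale and offset metadata). These byte counts are not stated in this README, so treat them as assumptions to check against the ggml version in use.

```python
# Assumed ggml block layouts: every block packs 32 weights.
BLOCK_WEIGHTS = 32

# Assumed bytes per block: packed values plus per-block metadata.
BLOCK_BYTES = {
    "q4_0": 18,  # 16 bytes of 4-bit values + fp16 scale
    "q4_1": 20,  # 16 bytes of 4-bit values + fp16 scale + fp16 minimum
    "q5_0": 22,  # 16 bytes low bits + 4 bytes high bits + fp16 scale
    "q5_1": 24,  # as q5_0, plus an fp16 minimum
    "q8_0": 34,  # 32 bytes of 8-bit values + fp16 scale
}


def bits_per_weight(qtype):
    """Effective storage cost of one weight, including block metadata."""
    return BLOCK_BYTES[qtype] * 8 / BLOCK_WEIGHTS


def size_ratio_vs_f16(qtype):
    """Size of a quantized tensor relative to the same tensor in F16."""
    return bits_per_weight(qtype) / 16


if __name__ == "__main__":
    for qtype in BLOCK_BYTES:
        print(f"{qtype}: {bits_per_weight(qtype):.1f} bits per weight, "
              f"{size_ratio_vs_f16(qtype):.0%} of F16 size")
```

These ratios apply only to tensors that are actually quantized. Because the codec model is left untouched, the overall file shrinks by less than the per-tensor ratio suggests.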

Seminal papers

Contributing

bark.cpp is an ongoing effort that relies on the community to last and evolve. Your contributions are welcome and highly valued. A contribution can be:

  • bug report: if you encounter a bug while using bark.cpp, don't hesitate to report it in the issues section.
  • feature request: if you want to add a new model or support a new platform, use the issues section to make suggestions.
  • pull request: if you have fixed a bug, added a feature, or even fixed a small typo in the documentation, submit a pull request and a reviewer will reach out to you.

Coding guidelines

  • Avoid adding third-party dependencies, extra files, extra headers, etc.
  • Always consider cross-compatibility with other operating systems and architectures.
