Skip to content

Commit

Permalink
update benchmark
Browse files Browse the repository at this point in the history
  • Loading branch information
shashikg committed Jan 28, 2024
1 parent fae33f7 commit ecdffe3
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ Checkout the Google Colab notebooks provided here: [notebooks](notebooks)

Stay tuned for a technical report comparing WhisperS2T against other whisper pipelines. Meanwhile, check some quick benchmarks on A30 GPU. See `scripts/` directory for the benchmarking scripts that I used.

![A30 Benchmark](https://github.com/shashikg/WhisperS2T/assets/22556187/c206ef88-98fa-4ac5-b413-9c0b957a8e58)
![A30 Benchmark](https://github.com/shashikg/WhisperS2T/assets/22556187/caecbb38-b69e-4daa-bcdc-16beb9456de5)

**NOTE:** I conducted all the benchmarks using the `without_timestamps` parameter set as `True`. Adjusting this parameter to `False` may enhance the Word Error Rate (WER) of the HuggingFace pipeline but at the expense of increased inference time. Notably, the improvements in inference speed were achieved solely through a **superior pipeline design**, without any specific optimization made to the backend inference engines (such as CTranslate2, FlashAttention2, etc.). For instance, WhisperS2T (utilizing FlashAttention2) demonstrates significantly superior inference speed compared to the HuggingFace pipeline (also using FlashAttention2), despite both leveraging the same inference engine—HuggingFace whisper model with FlashAttention2. Additionally, there is a noticeable difference in the WER as well.

Expand Down

0 comments on commit ecdffe3

Please sign in to comment.