High Latency in TTS Synthesis on Android with Screen Readers #1337
Hello,

I am using your TTS as an Android Text-to-Speech (TTS) engine for offline use, but I have run into an issue with audio synthesis: when using screen readers on Android, it takes approximately 500 ms in the onSynthesize method to speak the text on the screen. Is there any option or solution to reduce this latency? I am trying to create a TTS system for Android that blind and low-vision users can use effectively with their screen readers, so minimizing latency is critical for accessibility.
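One general way to reduce perceived latency in an Android TTS engine is to stream audio to the screen reader incrementally through the standard SynthesisCallback API, rather than synthesizing the whole utterance before returning. Here is a minimal sketch of that pattern, not this project's actual engine code; splitIntoChunks and synthesizeChunk are hypothetical stand-ins for a real text splitter and model call, and the 22050 Hz mono 16-bit format is only an assumption.

```kotlin
import android.media.AudioFormat
import android.speech.tts.SynthesisCallback
import android.speech.tts.SynthesisRequest
import android.speech.tts.TextToSpeech

// Sketch: stream chunked audio through SynthesisCallback so the screen
// reader hears the first chunk without waiting for the full utterance.
// `splitIntoChunks` and `synthesizeChunk` are hypothetical stand-ins.
fun streamSynthesis(
    request: SynthesisRequest,
    callback: SynthesisCallback,
    splitIntoChunks: (String) -> List<String>,   // e.g. punctuation-based
    synthesizeChunk: (String) -> ByteArray,      // hypothetical: 16-bit PCM mono
) {
    val text = request.charSequenceText?.toString().orEmpty()
    // Assumed output format: 22050 Hz, 16-bit PCM, mono.
    if (callback.start(22050, AudioFormat.ENCODING_PCM_16BIT, 1) != TextToSpeech.SUCCESS) return
    for (chunk in splitIntoChunks(text)) {
        val pcm = synthesizeChunk(chunk)
        var offset = 0
        // audioAvailable() accepts at most maxBufferSize bytes per call.
        while (offset < pcm.size) {
            val len = minOf(callback.maxBufferSize, pcm.size - offset)
            if (callback.audioAvailable(pcm, offset, len) != TextToSpeech.SUCCESS) return
            offset += len
        }
    }
    callback.done()
}
```

Streaming does not make the model itself faster, but it moves time-to-first-audio from "whole utterance" down to "first chunk", which is usually what screen-reader users actually perceive as latency.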
Comments

Which model are you using, and what kind of Android phone, i.e., which CPU, are you using?
From experience, latency depends on the model, the device's processing power (CPU), and the length of the text being synthesized. Piper's "medium" quality models (around 60 MB) have lower latency compared to the other models. I was able to speed up inference by splitting the input text into batches, using punctuation as natural sentence boundaries; this lets smaller chunks of text be synthesized quickly, as sketched below. On more powerful devices this step may not be necessary, since they can handle longer texts efficiently, so you can adjust the batching to the device's capabilities.
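As a rough illustration of that batching idea, here is a sketch in Kotlin; the punctuation set is illustrative, and the synthesize callback is a hypothetical stand-in for the real per-chunk TTS call, not part of any specific API.

```kotlin
// Sketch of punctuation-based batching; `synthesize` is a hypothetical
// stand-in for the real per-chunk TTS call.
fun splitIntoChunks(text: String): List<String> =
    text.split(Regex("(?<=[.!?;:,])\\s+"))   // split after sentence/clause punctuation
        .map(String::trim)
        .filter(String::isNotEmpty)

fun speakInChunks(text: String, synthesize: (String) -> Unit) {
    for (chunk in splitIntoChunks(text)) {
        // Short chunks synthesize quickly, so the first one can start
        // playing while the rest are still being generated.
        synthesize(chunk)
    }
}
```

On powerful devices you could merge chunks up to some minimum length, or skip splitting entirely, to avoid choppy prosody at chunk boundaries.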
Here are the details:
If this phone uses the Cortex-A78 cores during synthesis, then it should be very fast. If it uses the Cortex-A55 cores, then it will be slow.
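Which core type the synthesis threads land on is ultimately up to the kernel scheduler, but you can at least inspect the device's core layout. A rough sketch, assuming the standard Linux cpufreq sysfs paths (present on most Android devices, though some restrict access): the cores reporting the highest maximum frequency are typically the big ones (e.g. Cortex-A78), and the rest the little ones (e.g. Cortex-A55).

```kotlin
import java.io.File

// Sketch: list each core's maximum frequency via sysfs. Assumes the
// standard Linux cpufreq layout; some devices restrict these files.
fun coreMaxFrequenciesKHz(): Map<Int, Long?> =
    (0 until Runtime.getRuntime().availableProcessors()).associateWith { cpu ->
        runCatching {
            File("/sys/devices/system/cpu/cpu$cpu/cpufreq/cpuinfo_max_freq")
                .readText().trim().toLong()
        }.getOrNull()   // null if the file is missing or unreadable
    }

fun main() {
    coreMaxFrequenciesKHz().forEach { (cpu, kHz) ->
        println("cpu$cpu: ${kHz ?: "unreadable"} kHz")
    }
}
```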