Merge pull request babysor#13 from babysor/chineseinputsupport
Toolbox add Chinese character input support
babysor authored Aug 16, 2021
2 parents 57b06a2 + f6306b5 commit e66d298
Showing 4 changed files with 11 additions and 12 deletions.
2 changes: 2 additions & 0 deletions README-CN.md
@@ -45,6 +45,8 @@
You can then try the toolbox:
`python demo_toolbox.py -d <datasets_root>`

> Good news🤩: Chinese can be used directly
## TODO
- [X] Allow direct use of Chinese
- [X] Add demo video
2 changes: 2 additions & 0 deletions README.md
@@ -51,6 +51,8 @@ You can then try the toolbox:
or
`python demo_toolbox.py`

> Good news🤩: Chinese characters are supported
## TODO
- [x] Add demo video
- [X] Add support for more datasets
6 changes: 5 additions & 1 deletion synthesizer/inference.py
@@ -9,7 +9,7 @@
from typing import Union, List
import numpy as np
import librosa

from pypinyin import lazy_pinyin, Style

class Synthesizer:
sample_rate = hparams.sample_rate
@@ -91,6 +91,10 @@ def synthesize_spectrograms(self, texts: List[str],
simple_table([("Tacotron", str(tts_k) + "k"),
("r", self._model.r)])

# Convert Chinese characters to pinyin (tone numbers appended, Style.TONE3)
list_of_pinyin = lazy_pinyin(texts, style=Style.TONE3)
texts = [" ".join([v for v in list_of_pinyin if v.strip()])]

# Preprocess text inputs
inputs = [text_to_sequence(text.strip(), hparams.tts_cleaner_names) for text in texts]
if not isinstance(embeddings, list):
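
Note on the added lines: they join every converted token into one string, so `texts` becomes a single-element list no matter how many lines were passed in. If one output per input line is wanted (an assumption about intent, not something the commit states), a per-text conversion could look like the sketch below. The names `texts_to_pinyin`, `_pypinyin_tokens`, and the `to_tokens` parameter are hypothetical and are not part of the repository; only `lazy_pinyin` and `Style.TONE3` come from the diff.

```python
from typing import Callable, List, Optional


def _pypinyin_tokens(text: str) -> List[str]:
    """Default converter: pypinyin with tone numbers appended (Style.TONE3)."""
    # Imported lazily so the helper below can also be used with a stub converter.
    from pypinyin import lazy_pinyin, Style
    return lazy_pinyin(text, style=Style.TONE3)


def texts_to_pinyin(texts: List[str],
                    to_tokens: Optional[Callable[[str], List[str]]] = None) -> List[str]:
    """Convert each text on its own so that len(result) == len(texts).

    Hypothetical helper: `to_tokens` maps one string to a list of tokens and
    defaults to the pypinyin-based converter above.
    """
    convert = to_tokens or _pypinyin_tokens
    converted = []
    for text in texts:
        # Same filter as the committed code: drop whitespace-only tokens.
        converted.append(" ".join(tok for tok in convert(text) if tok.strip()))
    return converted
```

Inside `synthesize_spectrograms`, this would replace the two added lines with something like `texts = texts_to_pinyin(texts)`, keeping one entry in `texts` for each entry in `embeddings`.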
13 changes: 2 additions & 11 deletions toolbox/ui.py
@@ -36,17 +36,8 @@
], dtype=np.float) / 255

default_text = \
"Welcome to the toolbox! To begin, load an utterance from your datasets or record one " \
"yourself.\nOnce its embedding has been created, you can synthesize any text written here.\n" \
"The synthesizer expects to generate " \
"outputs that are somewhere between 5 and 12 seconds.\nTo mark breaks, write a new line. " \
"Each line will be treated separately.\nThen, they are joined together to make the final " \
"spectrogram. Use the vocoder to generate audio.\nThe vocoder generates almost in constant " \
"time, so it will be more time efficient for longer inputs like this one.\nOn the left you " \
"have the embedding projections. Load or record more utterances to see them.\nIf you have " \
"at least 2 or 3 utterances from a same speaker, a cluster should form.\nSynthesized " \
"utterances are of the same color as the speaker whose voice was used, but they're " \
"represented with a cross."
"欢迎使用工具箱, 现已支持中文输入!"



class UI(QDialog):