Commit

typos
Rayhane-mamah authored Apr 9, 2018
1 parent e48447f commit 54593a0
Showing 3 changed files with 6 additions and 7 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -42,7 +42,7 @@ Tensorflow implementation of Deep mind's Tacotron-2. A deep neural network archi
The previous tree shows the current state of the repository.

- Step **(0)**: Get your dataset, here I have set the examples of **Ljspeech**, **en_US** and **en_UK** (from **M-AILABS**).
- Step **(1)**: Preprocess your data. This will give you the **trainin_data** folder.
- Step **(1)**: Preprocess your data. This will give you the **training_data** folder.
- Step **(2)**: Train your Tacotron model. Yields the **logs-Tacotron** folder.
- Step **(3)**: Synthesize/Evaluate the Tacotron model. Gives the **tacotron_output** folder.
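
The snippet below is a minimal Python sketch of how steps (1)-(3) might be driven end to end. Only `preprocess.py` and its `dataset`/`language` arguments appear in this commit; `train.py`, `synthesize.py`, and the `--model` flag are assumptions about the repository's CLI, not confirmed by this diff.

```python
import subprocess

# Hypothetical driver for steps (1)-(3); flag names are assumptions,
# so check each script's --help before relying on them.
subprocess.run(["python", "preprocess.py", "--dataset", "M-AILABS",
                "--language", "en_US"], check=True)    # step (1): writes training_data/
subprocess.run(["python", "train.py", "--model", "Tacotron"],
               check=True)                             # step (2): writes logs-Tacotron/
subprocess.run(["python", "synthesize.py", "--model", "Tacotron"],
               check=True)                             # step (3): writes tacotron_output/
```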

@@ -82,7 +82,7 @@ else:
# Dataset:
We tested the code above on the [ljspeech dataset](https://keithito.com/LJ-Speech-Dataset/), which has almost 24 hours of labeled voice recordings from a single actress. (Further info on the dataset is available in the README file when you download it.)

We are also running current tests on the [new L-AILABS speech dataset](http://www.m-ailabs.bayern/en/the-mailabs-speech-dataset/) which contains more than 700h of speech (more than 80 Gb of data) for more than 10 languages.
We are also running current tests on the [new M-AILABS speech dataset](http://www.m-ailabs.bayern/en/the-mailabs-speech-dataset/) which contains more than 700h of speech (more than 80 Gb of data) for more than 10 languages.

After **downloading** the dataset, **extract** the compressed file, and **place the folder inside the cloned repository.**

5 changes: 2 additions & 3 deletions datasets/preprocessor.py
@@ -106,14 +106,13 @@ def _process_utterance(mel_dir, wav_dir, index, wav_path, text):

#Zero pad for quantized signal
out = np.pad(out, (l, r), mode='constant', constant_values=constant_values)
T = mel_spectrogram.shape[0]
time_steps = len(out)
assert time_steps >= T * audio.get_hop_size()
assert time_steps >= n_frames * audio.get_hop_size()

#time resolution adjustment
#ensure length of raw audio is multiple of hop size so that we can use
#transposed convolution to upsample
out = out[:T * audio.get_hop_size()]
out = out[:n_frames * audio.get_hop_size()]
assert time_steps % audio.get_hop_size() == 0

# Write the spectrogram and audio to disk
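This hunk swaps the local `T` for the pre-existing `n_frames` count, but the invariant it enforces is unchanged: after padding and trimming, the quantized audio must contain exactly one hop of samples per mel frame, so a transposed convolution can upsample frame-rate features back to sample rate. A standalone sketch of that invariant, using made-up `hop_size` and `n_frames` values instead of the repository's `audio.get_hop_size()`:

```python
import numpy as np

hop_size = 256                             # assumed hop size; the real value comes from hparams
n_frames = 811                             # mel frames for one utterance
out = np.zeros(n_frames * hop_size + 97)   # padded quantized audio, slightly too long

assert len(out) >= n_frames * hop_size
out = out[:n_frames * hop_size]            # trim to an exact multiple of the hop size
assert len(out) % hop_size == 0
assert len(out) // hop_size == n_frames    # exactly one hop of audio per mel frame
```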
4 changes: 2 additions & 2 deletions preprocess.py
@@ -40,8 +40,8 @@ def norm_data(args):


if args.dataset == 'M-AILABS':
supported_languages = ['en_US', 'en_UK', 'fr_FR', 'it_IT', 'de_DE', 'es-ES', 'ru-RU',
'uk-UK', 'pl-PL', 'nl-NL', 'pt-PT', 'sv-FI', 'sv-SE', 'tr-TR', 'ar-SA']
supported_languages = ['en_US', 'en_UK', 'fr_FR', 'it_IT', 'de_DE', 'es_ES', 'ru_RU',
'uk_UK', 'pl_PL', 'nl_NL', 'pt_PT', 'sv_FI', 'sv_SE', 'tr_TR', 'ar_SA']
if args.language not in supported_languages:
raise ValueError('Please enter a supported language to use from M-AILABS dataset! \n{}'.format(
supported_languages))
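Since the language check is a plain membership test against `supported_languages`, the hyphen-to-underscore normalization above is what decides whether a code is accepted. A small standalone sketch (not a call into the repository) of the effect:

```python
# supported_languages copied from the corrected list in this commit.
supported_languages = ['en_US', 'en_UK', 'fr_FR', 'it_IT', 'de_DE', 'es_ES', 'ru_RU',
                       'uk_UK', 'pl_PL', 'nl_NL', 'pt_PT', 'sv_FI', 'sv_SE', 'tr_TR', 'ar_SA']

for language in ('es_ES', 'es-ES'):
    if language not in supported_languages:
        print('{}: would raise ValueError'.format(language))    # hyphenated form is rejected
    else:
        print('{}: accepted'.format(language))                   # underscore form passes
```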
