Merge branch 'master' of github.com:pytorch/hub into enable_cuda_test
Ailing Zhang committed Jun 11, 2019
2 parents 58ba9f5 + a98e1fc commit 2c8a2bd
Showing 25 changed files with 263 additions and 9 deletions.
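
Most of the changed pages gain an `accelerator` field in their front matter, and `docs/template.md` lists `"cuda"` and `"cuda-optional"` as the currently supported values. As a rough sketch of how such a field could be checked, assuming the simple `key: value` front matter seen in these pages (the helper names below are hypothetical and are not part of the pytorch/hub repository):

```python
# Hypothetical helpers: not part of the pytorch/hub repository.
SUPPORTED_ACCELERATORS = {"cuda", "cuda-optional"}  # values listed in docs/template.md


def parse_front_matter(text):
    """Return the 'key: value' pairs found between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split at the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_accelerator(text):
    """Return True if 'accelerator' is absent (it is optional) or supported."""
    value = parse_front_matter(text).get("accelerator")
    return value is None or value in SUPPORTED_ACCELERATORS


page = """---
title: Tacotron 2
github-link: https://github.com/NVIDIA/DeepLearningExamples
accelerator: cuda
order: 10
---
"""
fields = parse_front_matter(page)
print(fields["accelerator"], check_accelerator(page))
```

A real check would more likely use a YAML parser; the line-based parsing here is only meant to keep the sketch dependency-free.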
9 changes: 4 additions & 5 deletions docs/template.md
@@ -3,7 +3,7 @@ layout: hub_detail
background-class: hub-background
body-class: hub
category: researchers
<!-- Only change fields below -->
<!-- Only change fields below(remove this line before submitting a PR) -->
title: <REQUIRED: short model name>
summary: <REQUIRED: 1-2 sentences>
image: <REQUIRED: best image to represent your model>
@@ -12,19 +12,18 @@ tags: <REQUIRED: [tag1, tag2, ...]>
github-link: <REQUIRED>
featured_image_1: <OPTIONAL: use no-image if not applicable>
featured_image_2: <OPTIONAL: use no-image if not applicable>
accelerator: <OPTIONAL: Current supported values: "cuda", "cuda-optional">
---
<!-- REQUIRED: provide a working script to demonstrate it works with torch.hub, example below -->
```python
import torch
torch.hub.load('pytorch/vision', 'resnet18', pretrained=True)
```
<!-- Walkthrough a small example of using your model. Ideally, less than 25 lines of code -->

<!-- REQUIRED: detailed model description below, in markdown format, feel free to add new sections as necessary -->
### Model Description

<!-- OPTIONAL: put special requirement of your model here, e.g. only supports Python3 -->
### Requiresments

<!-- OPTIONAL: put link to referece papers -->
<!-- OPTIONAL: put link to reference papers -->
### References

2 changes: 2 additions & 0 deletions facebookresearch_pytorch-gan-zoo_dcgan.md
@@ -11,6 +11,8 @@ tags: [vision, generative]
github-link: https://github.com/facebookresearch/pytorch_GAN_zoo/blob/master/models/DCGAN.py
featured_image_1: dcgan_fashionGen.jpg
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions facebookresearch_pytorch-gan-zoo_pgan.md
@@ -11,6 +11,8 @@ tags: [vision, generative]
github-link: https://github.com/facebookresearch/pytorch_GAN_zoo/blob/master/models/progressive_gan.py
featured_image_1: pgan_mix.jpg
featured_image_2: pgan_celebaHQ.jpg
accelerator: cuda-optional
order: 3
---


2 changes: 2 additions & 0 deletions huggingface_pytorch-pretrained-bert_bert.md
@@ -11,6 +11,8 @@ tags: [nlp]
github-link: https://github.com/huggingface/pytorch-pretrained-BERT.git
featured_image_1: bert1.png
featured_image_2: bert2.png
accelerator: cuda-optional
order: 2
---

### Model Description
2 changes: 2 additions & 0 deletions huggingface_pytorch-pretrained-bert_gpt.md
@@ -11,6 +11,8 @@ tags: [nlp]
github-link: https://github.com/huggingface/pytorch-pretrained-BERT.git
featured_image_1: GPT1.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

### Model Description
Binary file added images/ncf_diagram.png
Binary file added images/nvidia_logo.png
Binary file added images/tacotron2_diagram.png
Binary file added images/waveglow_diagram.png
109 changes: 109 additions & 0 deletions nvidia_deeplearningexamples_tacotron2.md
@@ -0,0 +1,109 @@
---
layout: hub_detail
background-class: hub-background
body-class: hub
title: Tacotron 2
summary: The Tacotron 2 model for generating mel spectrograms from text
category: researchers
image: nvidia_logo.png
author: NVIDIA
tags: [audio]
github-link: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2
featured_image_1: tacotron2_diagram.png
featured_image_2: no-image
accelerator: cuda
order: 10
---

```python
import torch
hub_model = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_tacotron2')
```
This loads the Tacotron 2 model pre-trained on the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/).

### Model Description

The Tacotron 2 and WaveGlow models form a text-to-speech system that enables users to synthesize natural-sounding speech from raw transcripts without any additional prosody information. The Tacotron 2 model produces mel spectrograms from input text using an encoder-decoder architecture. WaveGlow (also available via torch.hub) is a flow-based model that consumes the mel spectrograms to generate speech.

This implementation of the Tacotron 2 model differs from the model described in the paper: it uses Dropout instead of Zoneout to regularize the LSTM layers.

### Example

In the example below:
- pretrained Tacotron2 and WaveGlow models are loaded from torch.hub
- Tacotron2 generates a mel spectrogram given a tensor representation of an input text ("Hello world, I missed you")
- WaveGlow generates sound given the mel spectrogram
- the output sound is saved in an 'audio.wav' file
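
To make "tensor representation of an input text" more concrete, here is a toy sketch of mapping characters to integer IDs. The symbol table and the function name are hypothetical and chosen only for illustration; they are not the ones used by the model's `text_to_sequence`, which also applies text cleaners.

```python
# Toy illustration only: this symbol table is hypothetical and is NOT the
# one used by Tacotron 2's text_to_sequence.
SYMBOLS = "_ abcdefghijklmnopqrstuvwxyz,.!?'"
SYMBOL_TO_ID = {symbol: index for index, symbol in enumerate(SYMBOLS)}


def toy_text_to_sequence(text):
    """Lowercase the text and map each known character to an integer ID."""
    return [SYMBOL_TO_ID[ch] for ch in text.lower() if ch in SYMBOL_TO_ID]


sequence = toy_text_to_sequence("Hello world, I missed you")
batch = [sequence]  # a batch of one sequence, i.e. shape (1, sequence_length)
print(len(sequence), sequence[:5])
```

Characters missing from the table are silently dropped in this sketch; a real front end would normalize them instead (numbers, abbreviations, accents).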

To run the example you need some extra Python packages installed.
These are needed for preprocessing the text and audio, as well as for display and input / output.

```bash
pip install numpy scipy librosa unidecode inflect
```

Now, let's make the model say *"hello world, I missed you"*

```python
text = "hello world, I missed you"
```

```python
import numpy as np
from scipy.io.wavfile import write
```

Prepare tacotron2 for inference

```python
# hub_model is the Tacotron 2 model loaded at the top of this page
tacotron2 = hub_model.to('cuda')
tacotron2.eval()
```

Load waveglow from PyTorch Hub

```python
waveglow = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_waveglow')
waveglow = waveglow.remove_weightnorm(waveglow)
waveglow = waveglow.to('cuda')
waveglow.eval()
```

Now chain pre-processing -> tacotron2 -> waveglow

```python
# preprocessing
sequence = np.array(tacotron2.text_to_sequence(text, ['english_cleaners']))[None, :]
sequence = torch.from_numpy(sequence).to(device='cuda', dtype=torch.int64)

# run the models
with torch.no_grad():
    _, mel, _, _ = tacotron2.infer(sequence)
    audio = waveglow.infer(mel)
audio_numpy = audio[0].data.cpu().numpy()
rate = 22050  # sampling rate of the generated audio, in Hz
```

You can write it to a file and listen to it

```python
write("audio.wav", rate, audio_numpy)
```
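
`scipy.io.wavfile.write` stores a floating-point array as a floating-point WAV file. If a player has trouble with that format, one option is to write 16-bit PCM instead. The sketch below uses only the Python standard library; the function name `write_pcm16` is hypothetical, and it assumes the samples lie in [-1, 1], which should be checked against the actual model output.

```python
import math
import struct
import wave


def write_pcm16(path, rate, samples):
    """Write float samples (assumed to lie in [-1, 1]) as 16-bit mono PCM."""
    # Clip first so out-of-range values cannot overflow the 16-bit range.
    clipped = [max(-1.0, min(1.0, float(s))) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(rate)
        wav_file.writeframes(struct.pack("<%dh" % len(ints), *ints))


# Stand-in signal: one second of a 440 Hz tone instead of model output.
rate = 22050
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
write_pcm16("tone.wav", rate, tone)
```

With the generated audio, the call would look like `write_pcm16("audio.wav", rate, audio_numpy)`, since the function only needs an iterable of numbers.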


Alternatively, play it right away in a notebook with IPython widgets

```python
from IPython.display import Audio
Audio(audio_numpy, rate=rate)
```

### Details
For detailed information on model input and output, training recipes, inference, and performance, visit [GitHub](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) and/or [NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch).

### References

- [Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions](https://arxiv.org/abs/1712.05884)
- [WaveGlow: A Flow-based Generative Network for Speech Synthesis](https://arxiv.org/abs/1811.00002)
- [Tacotron2 and WaveGlow on NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch)
- [Tacotron2 and Waveglow on github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2)
107 changes: 107 additions & 0 deletions nvidia_deeplearningexamples_waveglow.md
@@ -0,0 +1,107 @@
---
layout: hub_detail
background-class: hub-background
body-class: hub
title: WaveGlow
summary: WaveGlow model for generating speech from mel spectrograms (generated by Tacotron2)
category: researchers
image: nvidia_logo.png
author: NVIDIA
tags: [audio]
github-link: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2
featured_image_1: waveglow_diagram.png
featured_image_2: no-image
accelerator: cuda
order: 10
---

```python
import torch
waveglow = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_waveglow')
```
This loads the WaveGlow model pre-trained on the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/).

### Model Description

The Tacotron 2 and WaveGlow models form a text-to-speech system that enables users to synthesize natural-sounding speech from raw transcripts without any additional prosody information. The Tacotron 2 model (also available via torch.hub) produces mel spectrograms from input text using an encoder-decoder architecture. WaveGlow is a flow-based model that consumes the mel spectrograms to generate speech.

### Example

In the example below:
- pretrained Tacotron2 and WaveGlow models are loaded from torch.hub
- Tacotron2 generates a mel spectrogram given a tensor representation of an input text ("Hello world, I missed you")
- WaveGlow generates sound given the mel spectrogram
- the output sound is saved in an 'audio.wav' file

To run the example you need some extra Python packages installed.
These are needed for preprocessing the text and audio, as well as for display and input / output.

```bash
pip install numpy scipy librosa unidecode inflect
```

Now, let's make the model say *"hello world, I missed you"*

```python
text = "hello world, I missed you"
```

```python
import numpy as np
from scipy.io.wavfile import write
```

Prepare the waveglow model for inference

```python
waveglow = waveglow.remove_weightnorm(waveglow)
waveglow = waveglow.to('cuda')
waveglow.eval()
```

Load tacotron2 from PyTorch Hub

```python
tacotron2 = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_tacotron2')
tacotron2 = tacotron2.to('cuda')
tacotron2.eval()
```

Now chain pre-processing -> tacotron2 -> waveglow

```python
# preprocessing
sequence = np.array(tacotron2.text_to_sequence(text, ['english_cleaners']))[None, :]
sequence = torch.from_numpy(sequence).to(device='cuda', dtype=torch.int64)

# run the models
with torch.no_grad():
    _, mel, _, _ = tacotron2.infer(sequence)
    audio = waveglow.infer(mel)
audio_numpy = audio[0].data.cpu().numpy()
rate = 22050  # sampling rate of the generated audio, in Hz
```
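
The sampling rate ties the number of generated samples to playback time. A small sketch of that arithmetic follows; the function names are hypothetical, and the hop length of 256 samples per mel frame is an assumption that should be checked against the model's configuration.

```python
def samples_from_mel_frames(num_frames, hop_length):
    """Approximate number of audio samples produced from num_frames mel frames."""
    return num_frames * hop_length


def audio_duration_seconds(num_samples, rate):
    """Duration of a mono signal with num_samples samples at rate Hz."""
    return num_samples / rate


rate = 22050       # Hz, as in the example above
hop_length = 256   # assumed value, not taken from this page
num_samples = samples_from_mel_frames(200, hop_length)
duration = audio_duration_seconds(num_samples, rate)
print(num_samples, round(duration, 2))
```

For the audio generated above, `audio_duration_seconds(len(audio_numpy), rate)` would give the length of the clip in seconds.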

You can write it to a file and listen to it

```python
write("audio.wav", rate, audio_numpy)
```


Alternatively, play it right away in a notebook with IPython widgets

```python
from IPython.display import Audio
Audio(audio_numpy, rate=rate)
```

### Details
For detailed information on model input and output, training recipes, inference, and performance, visit [GitHub](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) and/or [NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch).

### References

- [Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions](https://arxiv.org/abs/1712.05884)
- [WaveGlow: A Flow-based Generative Network for Speech Synthesis](https://arxiv.org/abs/1811.00002)
- [Tacotron2 and WaveGlow on NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch)
- [Tacotron2 and Waveglow on github](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2)
2 changes: 2 additions & 0 deletions pytorch_vision_alexnet.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/alexnet.py
featured_image_1: alexnet1.png
featured_image_2: alexnet2.png
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_deeplabv3_resnet101.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/segmentation/deeplabv3.py
featured_image_1: deeplab1.png
featured_image_2: deeplab2.png
accelerator: cuda-optional
order: 1
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_densenet.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/densenet.py
featured_image_1: densenet1.png
featured_image_2: densenet2.png
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_fcn_resnet101.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/segmentation/fcn.py
featured_image_1: deeplab1.png
featured_image_2: fcn2.png
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_googlenet.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/googlenet.py
featured_image_1: googlenet1.png
featured_image_2: googlenet2.png
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_inception_v3.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/inception.py
featured_image_1: inception_v3.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_mobilenet_v2.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/mobilenet.py
featured_image_1: mobilenet_v2_1.png
featured_image_2: mobilenet_v2_2.png
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_resnet.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py
featured_image_1: resnet.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_resnext.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py
featured_image_1: resnext.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_shufflenet_v2.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/shufflenetv2.py
featured_image_1: shufflenet_v2_1.png
featured_image_2: shufflenet_v2_2.png
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_squeezenet.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/squeezenet.py
featured_image_1: squeezenet.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python
2 changes: 2 additions & 0 deletions pytorch_vision_vgg.md
@@ -11,6 +11,8 @@ tags: [vision]
github-link: https://github.com/pytorch/vision/blob/master/torchvision/models/vgg.py
featured_image_1: vgg.png
featured_image_2: no-image
accelerator: cuda-optional
order: 10
---

```python