forked from pytorch/hub
Commit: Merge branch 'master' of github.com:pytorch/hub into enable_cuda_test
Showing 25 changed files with 263 additions and 9 deletions.
@@ -0,0 +1,109 @@
---
layout: hub_detail
background-class: hub-background
body-class: hub
title: Tacotron 2
summary: The Tacotron 2 model for generating mel spectrograms from text
category: researchers
image: nvidia_logo.png
author: NVIDIA
tags: [audio]
github-link: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2
featured_image_1: tacotron2_diagram.png
featured_image_2: no-image
accelerator: cuda
order: 10
---

```python
import torch
tacotron2 = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_tacotron2')
```
will load the Tacotron 2 model pre-trained on the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/).

### Model Description

Tacotron 2 and WaveGlow together form a text-to-speech system that lets users synthesize natural-sounding speech from raw transcripts without any additional prosody information. The Tacotron 2 model produces mel spectrograms from input text using an encoder-decoder architecture, and WaveGlow (also available via torch.hub) is a flow-based model that consumes the mel spectrograms to generate speech.

This implementation of Tacotron 2 differs from the model described in the paper: it uses Dropout instead of Zoneout to regularize the LSTM layers.

### Example

In the example below:
- pretrained Tacotron 2 and WaveGlow models are loaded from torch.hub
- Tacotron 2 generates a mel spectrogram from a tensor representation of the input text ("hello world, I missed you")
- WaveGlow generates sound from the mel spectrogram
- the output sound is saved to an 'audio.wav' file

To run the example you need some extra Python packages installed. These are needed for preprocessing the text and audio, as well as for display and input/output.

```bash
pip install numpy scipy librosa unidecode inflect
```

Now, let's make the model say *"hello world, I missed you"*

```python
text = "hello world, I missed you"
```

```python
import numpy as np
from scipy.io.wavfile import write
```

Prepare tacotron2 for inference

```python
tacotron2 = tacotron2.to('cuda')
tacotron2.eval()
```

Load waveglow from PyTorch Hub

```python
waveglow = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_waveglow')
waveglow = waveglow.remove_weightnorm(waveglow)
waveglow = waveglow.to('cuda')
waveglow.eval()
```

Now chain pre-processing -> tacotron2 -> waveglow

```python
# preprocessing
sequence = np.array(tacotron2.text_to_sequence(text, ['english_cleaners']))[None, :]
sequence = torch.from_numpy(sequence).to(device='cuda', dtype=torch.int64)

# run the models
with torch.no_grad():
    _, mel, _, _ = tacotron2.infer(sequence)
    audio = waveglow.infer(mel)
audio_numpy = audio[0].data.cpu().numpy()
rate = 22050
```
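
The `[None, :]` indexing in the preprocessing step adds a leading batch dimension, since the model expects a batch of sequences. A minimal NumPy sketch of that trick (the token IDs here are made up for illustration):

```python
import numpy as np

# a hypothetical encoded sentence: one token ID per character (illustrative only)
token_ids = np.array([15, 12, 21, 21, 24])   # shape (5,)

# [None, :] prepends a batch axis, turning it into a batch of one sequence
batch = token_ids[None, :]                   # shape (1, 5)

print(token_ids.shape, batch.shape)
```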

You can write it to a file and listen to it

```python
write("audio.wav", rate, audio_numpy)
```
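
`scipy.io.wavfile.write` infers the sample format from the array dtype, and WaveGlow produces floating-point samples. If a player expects 16-bit PCM, the float audio (assumed here to lie in [-1, 1]) can be scaled to `int16` first. A sketch using a generated sine wave as a stand-in for the model output:

```python
import numpy as np
from scipy.io.wavfile import write, read

rate = 22050
# stand-in for the model output: one second of a 440 Hz sine in [-1, 1]
t = np.linspace(0, 1, rate, endpoint=False)
audio_numpy = (0.5 * np.sin(2 * np.pi * 440 * t)).astype(np.float32)

# scale to the int16 range for 16-bit PCM output
audio_int16 = (audio_numpy * 32767).astype(np.int16)
write("audio_int16.wav", rate, audio_int16)

# read it back to confirm the sample rate and format
rate_read, data = read("audio_int16.wav")
print(rate_read, data.dtype)
```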

Alternatively, play it right away in a notebook with IPython widgets

```python
from IPython.display import Audio
Audio(audio_numpy, rate=rate)
```

### Details
For detailed information on model input and output, training recipes, inference and performance, visit [GitHub](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) or [NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch).

### References

- [Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions](https://arxiv.org/abs/1712.05884)
- [WaveGlow: A Flow-based Generative Network for Speech Synthesis](https://arxiv.org/abs/1811.00002)
- [Tacotron 2 and WaveGlow on NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch)
- [Tacotron 2 and WaveGlow on GitHub](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2)
@@ -0,0 +1,107 @@
---
layout: hub_detail
background-class: hub-background
body-class: hub
title: WaveGlow
summary: WaveGlow model for generating speech from mel spectrograms (generated by Tacotron2)
category: researchers
image: nvidia_logo.png
author: NVIDIA
tags: [audio]
github-link: https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2
featured_image_1: waveglow_diagram.png
featured_image_2: no-image
accelerator: cuda
order: 10
---

```python
import torch
waveglow = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_waveglow')
```
will load the WaveGlow model pre-trained on the [LJ Speech dataset](https://keithito.com/LJ-Speech-Dataset/).

### Model Description

Tacotron 2 and WaveGlow together form a text-to-speech system that lets users synthesize natural-sounding speech from raw transcripts without any additional prosody information. The Tacotron 2 model (also available via torch.hub) produces mel spectrograms from input text using an encoder-decoder architecture, and WaveGlow is a flow-based model that consumes the mel spectrograms to generate speech.

### Example

In the example below:
- pretrained Tacotron 2 and WaveGlow models are loaded from torch.hub
- Tacotron 2 generates a mel spectrogram from a tensor representation of the input text ("hello world, I missed you")
- WaveGlow generates sound from the mel spectrogram
- the output sound is saved to an 'audio.wav' file

To run the example you need some extra Python packages installed. These are needed for preprocessing the text and audio, as well as for display and input/output.

```bash
pip install numpy scipy librosa unidecode inflect
```

Now, let's make the model say *"hello world, I missed you"*

```python
text = "hello world, I missed you"
```

```python
import numpy as np
from scipy.io.wavfile import write
```

Prepare the waveglow model for inference

```python
waveglow = waveglow.remove_weightnorm(waveglow)
waveglow = waveglow.to('cuda')
waveglow.eval()
```

Load tacotron2 from PyTorch Hub

```python
tacotron2 = torch.hub.load('nvidia/DeepLearningExamples', 'nvidia_tacotron2')
tacotron2 = tacotron2.to('cuda')
tacotron2.eval()
```

Now chain pre-processing -> tacotron2 -> waveglow

```python
# preprocessing
sequence = np.array(tacotron2.text_to_sequence(text, ['english_cleaners']))[None, :]
sequence = torch.from_numpy(sequence).to(device='cuda', dtype=torch.int64)

# run the models
with torch.no_grad():
    _, mel, _, _ = tacotron2.infer(sequence)
    audio = waveglow.infer(mel)
audio_numpy = audio[0].data.cpu().numpy()
rate = 22050
```

You can write it to a file and listen to it

```python
write("audio.wav", rate, audio_numpy)
```

Alternatively, play it right away in a notebook with IPython widgets

```python
from IPython.display import Audio
Audio(audio_numpy, rate=rate)
```

### Details
For detailed information on model input and output, training recipes, inference and performance, visit [GitHub](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2) or [NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch).

### References

- [Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions](https://arxiv.org/abs/1712.05884)
- [WaveGlow: A Flow-based Generative Network for Speech Synthesis](https://arxiv.org/abs/1811.00002)
- [Tacotron 2 and WaveGlow on NGC](https://ngc.nvidia.com/catalog/model-scripts/nvidia:tacotron_2_and_waveglow_for_pytorch)
- [Tacotron 2 and WaveGlow on GitHub](https://github.com/NVIDIA/DeepLearningExamples/tree/master/PyTorch/SpeechSynthesis/Tacotron2)