Add SPOS docs and improve NAS doc structure (microsoft#1907)
* darts mutator docs
* fix docs
* update
* add docs for SPOS
* index SPOS
* restore workers
Yuge Zhang authored on Dec 31, 2019 · 1 parent 31f545e · commit c993f76
Showing 19 changed files with 395 additions and 170 deletions.

# DARTS

## Introduction

The paper [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Their method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent.

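For reference, the core of this continuous relaxation, as given in the paper, replaces the categorical choice of an operation on each edge with a softmax over all candidate operations:

```math
\bar{o}^{(i,j)}(x) = \sum_{o \in \mathcal{O}} \frac{\exp(\alpha_o^{(i,j)})}{\sum_{o' \in \mathcal{O}} \exp(\alpha_{o'}^{(i,j)})} \, o(x)
```

where `O` is the candidate operation set and the `alpha` values are the learnable architecture weights for edge `(i, j)`. Searching then reduces to learning the architecture weights jointly with the network weights.
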
The authors' code optimizes the network weights and the architecture weights alternately in mini-batches. They further explore using second-order optimization (unrolling) instead of first-order, to improve performance.

The implementation on NNI is based on the [official implementation](https://github.com/quark0/darts) and a [popular 3rd-party repo](https://github.com/khanrc/pt.darts). DARTS on NNI is designed to be general for arbitrary search spaces. A CNN search space tailored to CIFAR10, the same as in the original paper, is implemented as a use case of DARTS.

## Reproduction Results

The example above is meant to reproduce the results in the paper; we conduct experiments with both first-order and second-order optimization. Due to time limits, we retrain *only the best architecture* derived from the search phase, and we repeat the experiment *only once*. Our results are currently on par with the results reported in the paper. We will add more results later when they are ready.

| Test error (%) | In paper | Reproduction |
| ---------------------- | ------------- | ------------ |
| First order (CIFAR10) | 3.00 +/- 0.14 | 2.78 |
| Second order (CIFAR10) | 2.76 +/- 0.09 | 2.89 |

## Examples

### CNN Search Space

[Example code](https://github.com/microsoft/nni/tree/master/examples/nas/darts)

```bash
# Clone the NNI repo if you haven't already; otherwise skip this line and enter the code folder.
git clone https://github.com/Microsoft/nni.git

# search the best architecture
cd examples/nas/darts
python3 search.py

# train the best architecture
python3 retrain.py --arc-checkpoint ./checkpoints/epoch_49.json
```

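To give a sense of what `search.py` does internally, here is a minimal sketch of driving the search through `DartsTrainer`. The model class and the exact set of constructor arguments are assumptions for illustration (the model must be built from NNI mutables such as `LayerChoice`); see the reference section below for the authoritative signature.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from nni.nas.pytorch.darts import DartsTrainer

from model import CNN  # hypothetical: the CIFAR10 search-space model defined in the example

def accuracy(output, target):
    # Simple top-1 accuracy, in the dict format the trainer reports.
    return {"acc": (output.argmax(dim=1) == target).float().mean().item()}

transform = transforms.ToTensor()
dataset_train = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
dataset_valid = datasets.CIFAR10("./data", train=False, download=True, transform=transform)

model = CNN()
trainer = DartsTrainer(
    model,
    loss=nn.CrossEntropyLoss(),
    metrics=accuracy,
    optimizer=torch.optim.SGD(model.parameters(), 0.025, momentum=0.9, weight_decay=3e-4),
    num_epochs=50,
    dataset_train=dataset_train,
    dataset_valid=dataset_valid,
    unrolled=False,  # set True for second-order optimization
)
trainer.train()                             # alternates weight and architecture updates
trainer.export("./checkpoints/final.json")  # serialize the derived architecture
```

The exported JSON file is the kind of architecture checkpoint that `retrain.py --arc-checkpoint` consumes.
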
## Reference

### PyTorch

```eval_rst
.. autoclass:: nni.nas.pytorch.darts.DartsTrainer
    :members:

    .. automethod:: __init__

.. autoclass:: nni.nas.pytorch.darts.DartsMutator
    :members:
```

# ENAS

## Introduction

The paper [Efficient Neural Architecture Search via Parameter Sharing](https://arxiv.org/abs/1802.03268) uses parameter sharing between child models to accelerate the NAS process. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile, the model corresponding to the selected subgraph is trained to minimize a canonical cross-entropy loss.

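As a conceptual illustration of the controller update (plain PyTorch, not NNI's actual ENAS code): the controller defines a distribution over choices, a subgraph is sampled, and the log-probability of the sample is scaled by the validation reward (minus a baseline) to form the REINFORCE loss. All sizes and values below are toy placeholders.

```python
import torch
import torch.nn as nn

# Toy controller: independent categorical choices, one per decision point.
NUM_DECISIONS, NUM_OPS = 4, 3
logits = nn.Parameter(torch.zeros(NUM_DECISIONS, NUM_OPS))
optimizer = torch.optim.Adam([logits], lr=3.5e-4)

dist = torch.distributions.Categorical(logits=logits)
sample = dist.sample()   # one sampled subgraph: an op index per decision point
reward = 0.75            # stand-in for validation accuracy of the sampled subgraph
baseline = 0.70          # moving-average baseline reduces gradient variance

# REINFORCE: maximize expected reward => minimize -(reward - baseline) * log_prob.
loss = -(reward - baseline) * dist.log_prob(sample).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```
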
The implementation on NNI is based on the [official implementation in Tensorflow](https://github.com/melodyguan/enas), including a general-purpose reinforcement-learning controller and a trainer that trains the target network and this controller alternately. Following the paper, we have also implemented the macro and micro search spaces on CIFAR10 to demonstrate how to use these trainers. Since code for training from scratch on NNI is not ready yet, reproduction results are currently unavailable.

## Examples

### CIFAR10 Macro/Micro Search Space

[Example code](https://github.com/microsoft/nni/tree/master/examples/nas/enas)

```bash
# Clone the NNI repo if you haven't already; otherwise skip this line and enter the code folder.
git clone https://github.com/Microsoft/nni.git

# enter the example folder
cd examples/nas/enas

# search in macro search space
python3 search.py --search-for macro

# search in micro search space
python3 search.py --search-for micro

# view more options for search
python3 search.py -h
```

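A minimal sketch of how `search.py` might drive `EnasTrainer` follows. The argument list is an assumption based on this NNI release (notably `reward_function`, which maps a batch's outputs and targets to the controller's reward), and the model class is a placeholder; check the reference below for the authoritative signature.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from nni.nas.pytorch.enas import EnasTrainer

from macro import GeneralNetwork  # hypothetical: the macro search-space model from the example

def accuracy(output, target):
    return {"acc": (output.argmax(dim=1) == target).float().mean().item()}

def reward_accuracy(output, target):
    # Scalar reward used by the controller's policy-gradient update.
    return (output.argmax(dim=1) == target).float().mean().item()

transform = transforms.ToTensor()
dataset_train = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
dataset_valid = datasets.CIFAR10("./data", train=False, download=True, transform=transform)

model = GeneralNetwork()
trainer = EnasTrainer(
    model,
    loss=nn.CrossEntropyLoss(),
    metrics=accuracy,
    reward_function=reward_accuracy,
    optimizer=torch.optim.SGD(model.parameters(), 0.05, momentum=0.9, weight_decay=1e-4),
    num_epochs=310,
    dataset_train=dataset_train,
    dataset_valid=dataset_valid,
)
trainer.train()  # alternates child-model training and controller (RL) training
```
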
## Reference

### PyTorch

```eval_rst
.. autoclass:: nni.nas.pytorch.enas.EnasTrainer
    :members:

    .. automethod:: __init__

.. autoclass:: nni.nas.pytorch.enas.EnasMutator
    :members:

    .. automethod:: __init__
```

# P-DARTS

## Examples

[Example code](https://github.com/microsoft/nni/tree/master/examples/nas/pdarts)

```bash
# Clone the NNI repo if you haven't already; otherwise skip this line and enter the code folder.
git clone https://github.com/Microsoft/nni.git

# search the best architecture
cd examples/nas/pdarts
python3 search.py

# train the best architecture; this is the same process as DARTS
cd ../darts
python3 retrain.py --arc-checkpoint ../pdarts/checkpoints/epoch_2.json
```

# Single Path One-Shot (SPOS)

## Introduction

Proposed in [Single Path One-Shot Neural Architecture Search with Uniform Sampling](https://arxiv.org/abs/1904.00420) is a one-shot NAS method that addresses the difficulties of training one-shot NAS models by constructing a simplified supernet trained with a uniform path sampling method, so that all underlying architectures (and their weights) are trained fully and equally. An evolutionary algorithm is then applied to efficiently search for the best-performing architectures without any fine-tuning.

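To make the idea concrete, here is a small conceptual sketch of uniform path sampling in a supernet, written in plain PyTorch rather than against NNI's API: each layer holds several candidate blocks, and every forward pass picks one uniformly at random, so all candidates receive equal gradient updates during supernet training.

```python
import random
import torch
import torch.nn as nn

class SinglePathLayer(nn.Module):
    """One supernet layer: a list of candidate blocks, one sampled per forward pass."""
    def __init__(self, candidates):
        super().__init__()
        self.candidates = nn.ModuleList(candidates)

    def forward(self, x):
        # Uniform sampling: every candidate is trained with equal probability.
        choice = random.randrange(len(self.candidates))
        return self.candidates[choice](x)

# A toy supernet with two layers of three candidate convolutions each.
supernet = nn.Sequential(
    SinglePathLayer([nn.Conv2d(3, 8, k, padding=k // 2) for k in (1, 3, 5)]),
    nn.ReLU(),
    SinglePathLayer([nn.Conv2d(8, 8, k, padding=k // 2) for k in (1, 3, 5)]),
)

x = torch.randn(2, 3, 32, 32)
print(supernet(x).shape)  # each call may route through a different single path
```
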
The implementation on NNI is based on the [official repo](https://github.com/megvii-model/SinglePathOneShot). We implement a trainer that trains the supernet, and an evolution tuner that leverages the NNI framework to speed up the evolutionary search phase. We have also shown how to apply this to the search space in the paper; see the example below.

## Examples

Here is a use case which implements the search space in the paper, and shows how to use a FLOPs limit to perform uniform sampling.

[Example code](https://github.com/microsoft/nni/tree/master/examples/nas/spos)

### Requirements

NVIDIA DALI >= 0.16 is required, as we use DALI to accelerate the data loading of ImageNet. See the [installation guide](https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/docs/installation.html).

Download the FLOPs lookup table from [here](https://1drv.ms/u/s!Am_mmG2-KsrnajesvSdfsq_cN48?e=aHVppN) (maintained by [Megvii](https://github.com/megvii-model)). Put `op_flops_dict.pkl` and `checkpoint-150000.pth.tar` (if you don't want to retrain the supernet) under the `data` directory.

Prepare ImageNet in the standard format (follow the script [here](https://gist.github.com/BIGBALLON/8a71d225eff18d88e469e6ea9b39cef4)). Linking it to `data/imagenet` will be more convenient.

After preparation, the following code structure is expected:

```
spos
├── architecture_final.json
├── blocks.py
├── config_search.yml
├── data
│   ├── imagenet
│   │   ├── train
│   │   └── val
│   └── op_flops_dict.pkl
├── dataloader.py
├── network.py
├── readme.md
├── scratch.py
├── supernet.py
├── tester.py
├── tuner.py
└── utils.py
```

### Step 1. Train Supernet

```
python supernet.py
```

This will export the checkpoint to the `checkpoints` directory, for the next step.

NOTE: The data loading used in the official repo is [slightly different from usual](https://github.com/megvii-model/SinglePathOneShot/issues/5), as they use a BGR tensor and intentionally keep the values between 0 and 255 to align with their own DL framework. The option `--spos-preprocessing` will simulate the original behavior and enable you to use the pretrained checkpoints.

### Step 2. Evolution Search

Single Path One-Shot leverages an evolutionary algorithm to search for the best architecture. The tester, which is responsible for testing a sampled architecture, recalculates all the batch-norm statistics on a subset of training images, and evaluates the architecture on the full validation set.

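The batch-norm recalibration step can be sketched as follows: reset the running statistics, then run forward passes in training mode (without gradients) over a subset of training images, so the running mean and variance match the sampled path. This is a conceptual sketch, not the exact code in `tester.py`; the function name and arguments are illustrative.

```python
import torch
import torch.nn as nn

def recalibrate_batchnorm(model, loader, num_batches=200, device="cuda"):
    """Reset and re-estimate BN running statistics for a sampled architecture."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.reset_running_stats()  # clear statistics inherited from supernet training
            m.momentum = None        # None => cumulative moving average over batches
    model.train()                    # BN updates its statistics only in train mode
    with torch.no_grad():            # the weights themselves must stay untouched
        for i, (images, _) in enumerate(loader):
            if i >= num_batches:
                break
            model(images.to(device))
```
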
To make the tuner aware of the FLOPs limit and able to calculate FLOPs, we created a new tuner called `EvolutionWithFlops` in `tuner.py`, inheriting the evolution tuner in the SDK.

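The FLOPs constraint amounts to rejection sampling: keep drawing candidate architectures until one fits the budget. Below is a conceptual sketch; the names `random_architecture` and `get_flops`, the op names, and the budget value are hypothetical stand-ins, not the actual `tuner.py` API (which looks ops up in `op_flops_dict.pkl`).

```python
import random

CANDIDATE_OPS = ["choice_0", "choice_1", "choice_2", "choice_3"]  # hypothetical op names
NUM_LAYERS = 20
FLOPS_LIMIT = 330e6  # hypothetical budget

def random_architecture():
    # One uniformly sampled choice per layer, as in uniform path sampling.
    return [random.choice(CANDIDATE_OPS) for _ in range(NUM_LAYERS)]

def get_flops(arch):
    # Hypothetical stand-in for the per-op FLOPs lookup table.
    table = {"choice_0": 10e6, "choice_1": 13e6, "choice_2": 16e6, "choice_3": 20e6}
    return sum(table[op] for op in arch)

def sample_within_budget():
    # Rejection sampling: resample until the candidate fits the FLOPs budget.
    while True:
        arch = random_architecture()
        if get_flops(arch) <= FLOPS_LIMIT:
            return arch

print(sample_within_budget())
```
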
To generate a search space ready for the NNI framework, first run:

```
nnictl ss_gen -t "python tester.py"
```

This will generate a file called `nni_auto_gen_search_space.json`, which is a serialized representation of your search space.

By default, the search will use `checkpoint-150000.pth.tar` downloaded previously. In case you want to use the checkpoint trained by yourself in the last step, specify `--checkpoint` in the command in `config_search.yml`.

Then, launch the search with the evolution tuner:

```
nnictl create --config config_search.yml
```

The final architecture exported from every epoch of evolution can be found in `checkpoints` under the working directory of your tuner, which, by default, is `$HOME/nni/experiments/your_experiment_id/log`.

### Step 3. Train from Scratch

```
python scratch.py
```

By default, this will use `architecture_final.json`, an architecture provided by the official repo (converted into NNI format). You can use any architecture (e.g., the architecture found in step 2) with the `--fixed-arc` option.

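Under the hood, fixing an architecture in NNI's NAS interface of this era looks roughly like the sketch below. The helper name `apply_fixed_architecture` reflects our understanding of the SDK, and the model class is a placeholder; check `scratch.py` for the actual code.

```python
from nni.nas.pytorch.fixed import apply_fixed_architecture

from network import ShuffleNetV2OneShot  # hypothetical: the supernet class from network.py

model = ShuffleNetV2OneShot()
# Freeze the mutable choices to those recorded in the JSON file, so the model
# behaves as a plain, fixed network that can be trained with a normal loop.
apply_fixed_architecture(model, "architecture_final.json")
```
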
## Reference

### PyTorch

```eval_rst
.. autoclass:: nni.nas.pytorch.spos.SPOSEvolution
    :members:

    .. automethod:: __init__

.. autoclass:: nni.nas.pytorch.spos.SPOSSupernetTrainer
    :members:

    .. automethod:: __init__

.. autoclass:: nni.nas.pytorch.spos.SPOSSupernetTrainingMutator
    :members:

    .. automethod:: __init__
```

## Known Limitations

* Block search only. Channel search is not supported yet.
* Only the GPU version is provided here.

## Current Reproduction Results

Reproduction is still in progress. Due to the gap between the official release and the original paper, we compare our current results with both the official repo (per our own run) and the paper.

* The evolution phase is almost aligned with the official repo. Our evolution algorithm shows a converging trend and reaches ~65% accuracy at the end of the search. Nevertheless, this result is not on par with the paper. For details, please refer to [this issue](https://github.com/megvii-model/SinglePathOneShot/issues/6).
* The retrain phase is not aligned. Our retraining code, which uses the architecture released by the authors, reaches 72.14% accuracy, still leaving a gap to the 73.61% achieved by the official release and the 74.3% reported in the original paper.

[Documentation](https://nni.readthedocs.io/en/latest/NAS/DARTS.html)

[Documentation](https://nni.readthedocs.io/en/latest/NAS/ENAS.html)

This is a naive example that demonstrates how to use the NNI interface to implement a NAS search space.

[Documentation](https://nni.readthedocs.io/en/latest/NAS/PDARTS.html)