Commit 421a6d1: readme for dynamic_selection
jaychoi12 committed Jun 2, 2021 (parent: b093423)
Showing 6 changed files with 46 additions and 61 deletions.

dynamic_selection/README.md
# Sample-Selection Approaches and Collaboration with Noise-Robust Functions
This is a PyTorch implementation of sample-selection approaches and their collaboration with noise-robust loss functions.

## Dataset
### CIFAR10, CIFAR100
You do not need to prepare these datasets yourself; the code downloads them automatically.

### Clothing1M
You have to download the Clothing1M dataset and set its path before running the code.
- To download the dataset, follow the instructions at https://github.com/Cysu/noisy_label
- The directories and files of Clothing1M should be saved under `dir_to_data/clothing1m`. The directory structure should be:

dynamic_selection/dir_to_data/clothing1m/
├── 0/
├── ⋮
├── 9/
├── annotations/
├── category_names_chn.txt
├── category_names_eng.txt
├── clean_train_key_list.txt
├── clean_val_key_list.txt
├── clean_test_key_list.txt
├── clean_label_kv.txt
├── noisy_train_key_list.txt
└── noisy_label_kv.txt

- The directories `0/` to `9/` contain the image data.
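As a quick sanity check (a sketch only; the paths are taken from the tree above, so adjust them if your annotation files live elsewhere), you can verify that the key-list files are in place before launching a Clothing1M run:

```
for f in clean_train_key_list.txt noisy_train_key_list.txt noisy_label_kv.txt; do
  [ -e "dynamic_selection/dir_to_data/clothing1m/$f" ] || echo "missing: $f"
done
```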

## Usage
Brief descriptions of all arguments are given in `utils/args.py`; you can change the argument values according to those descriptions.

All the bash scripts below run the code with `60% symmetric noise`, the `cifar-10` dataset, and the `ResNet-34` architecture.
You can change these settings through the arguments described above.
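For example, a hypothetical invocation that switches to `40% asymmetric noise` on `cifar-100` could look like the line below; the flags are copied from the run scripts shipped with the repository (see the script reproduced at the end of this page), and the exact accepted values should be checked against `utils/args.py`:

```
python main.py -d 0 --asym true --percent 0.4 --lr_scheduler multistep --arch rn34 --loss_fn cce --dataset cifar100 --traintools coteaching --no_wandb --distill_mode fine-gmm --dynamic --dataseed 123 --every 10 --warmup 40
```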

### Sample-Selection based Approaches (Sec. 4.2.1)

Bash files for this section are in the `scripts/sample_selection_based/` directory.
When running our FINE algorithm, the FINE detector dynamically selects the clean data at every epoch, and the neural network is then trained on the selected samples (a minimal sketch of this selection step follows the commands below).

```
bash scripts/sample_selection_based/fine_cifar.sh
bash scripts/sample_selection_based/fine_clothing1m.sh
```
- You can switch between the cifar10 and cifar100 options in `fine_cifar.sh`.
- The Clothing1M dataset has to be set up (see the Dataset section above) before running `fine_clothing1m.sh`.
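Below is a minimal sketch (in plain NumPy/scikit-learn, not the repository's code) of the dynamic selection step described above: per class, samples are scored by their squared alignment with the principal eigenvector of the class gram matrix, and a two-component GMM splits the scores into a clean and a noisy group. The function name `fine_select` and the NumPy/GMM details are our assumptions for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fine_select(features, labels, num_classes):
    """Return indices judged 'clean' by a FINE-style criterion (illustrative sketch)."""
    # Normalize features so the alignment scores lie in [0, 1].
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    clean_idx = []
    for c in range(num_classes):
        idx = np.where(labels == c)[0]
        if len(idx) < 2:                      # too few samples to fit a 2-component GMM
            clean_idx.extend(idx.tolist())
            continue
        f = feats[idx]                        # (n_c, d) features labeled as class c
        gram = f.T @ f                        # (d, d) class gram matrix
        _, vecs = np.linalg.eigh(gram)
        u = vecs[:, -1]                       # principal eigenvector
        scores = (f @ u) ** 2                 # squared alignment with the eigenvector
        gmm = GaussianMixture(n_components=2, random_state=0).fit(scores.reshape(-1, 1))
        clean_component = np.argmax(gmm.means_.ravel())
        keep = gmm.predict(scores.reshape(-1, 1)) == clean_component
        clean_idx.extend(idx[keep].tolist())
    return np.array(sorted(clean_idx))
```

In the dynamic setting, the training loop would recompute `features` from the current network every epoch and train only on the returned indices.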


To run the `F-coteaching` experiment, which substitutes the sample-selection step of Co-teaching with our FINE algorithm, just run:

```
bash scripts/sample_selection_based/f-coteaching.sh
```
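For intuition, here is a rough sketch of what the substitution means: in standard Co-teaching each network keeps its small-loss samples to update its peer, and F-coteaching swaps that small-loss rule for a FINE-style selector (which scores features rather than losses). The function names and signatures below are illustrative, not the repository's API.

```python
import torch
import torch.nn.functional as F

def small_loss_select(losses, keep_ratio):
    # Standard Co-teaching rule: keep the fraction of samples with the smallest loss.
    num_keep = int(keep_ratio * losses.numel())
    return torch.argsort(losses)[:num_keep]

def coteaching_step(logits1, logits2, targets, keep_ratio, select_fn=small_loss_select):
    """One Co-teaching step: each network selects samples for its peer.

    Passing a FINE-based selector as `select_fn` instead of the small-loss
    rule gives the F-coteaching variant described above.
    """
    loss1 = F.cross_entropy(logits1, targets, reduction="none")
    loss2 = F.cross_entropy(logits2, targets, reduction="none")
    idx_for_net2 = select_fn(loss1, keep_ratio)   # network 1 picks data for network 2
    idx_for_net1 = select_fn(loss2, keep_ratio)   # network 2 picks data for network 1
    update1 = F.cross_entropy(logits1[idx_for_net1], targets[idx_for_net1])
    update2 = F.cross_entropy(logits2[idx_for_net2], targets[idx_for_net2])
    return update1, update2  # losses to backpropagate through network 1 and 2
```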

### Collaboration with Noise-Robust Loss Functions (Sec. 4.2.3)

These commands run our FINE algorithm together with various noise-robust loss functions; a small sketch of one of these losses (GCE) is given at the end of this section.
We use Cross Entropy (CE), Generalized Cross Entropy (GCE), Symmetric Cross Entropy (SCE), and Early-Learning Regularization (ELR).

```
bash scripts/robust_loss/fine_dynamic_ce.sh
bash scripts/robust_loss/fine_dynamic_gce.sh
bash scripts/robust_loss/fine_dynamic_sce.sh
bash scripts/robust_loss/fine_dynamic_elr.sh
```
The config files can be modified to adjust hyperparameters and optimization settings.
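For reference, here is a minimal sketch of one of the losses listed above, Generalized Cross Entropy (GCE); the function name and the commonly used default `q=0.7` are illustrative choices, not values read from this repository's config files.

```python
import torch
import torch.nn.functional as F

def gce_loss(logits, targets, q=0.7):
    """Generalized Cross Entropy: mean of (1 - p_y^q) / q over the batch.

    It approaches standard cross entropy as q -> 0 and MAE at q = 1,
    which is what makes it robust to noisy labels.
    """
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp(min=1e-7)
    return ((1.0 - p_y.pow(q)) / q).mean()
```

In the collaboration setting above, the network is trained with such a loss on the samples that FINE keeps at each epoch.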

### Co-teaching families
Train the network on the Symmetric Noise CIFAR-10 dataset (noise rate = 0.8). Fix the `lr_scheduler` argument to `coteach`.
You can choose among four values for `loss_fn` (`coteach`, `coteach+`, `coteachdistill`, `coteach+distill`).
```
python train.py --device 0 --config hyperparams/coteach/config_cifar10_coteach_rn18.json --asym=False --percent=0.8 --no_wandb
```
The `--no_wandb` flag disables wandb logging, so the run stays on your local machine (or server).
<!--
Train the network with CLK or SAME with coteachdistill or coteach+distill (noise rate = 0.8)
```
python train_coteaching.py --distillation --reinit --distill_mode kmeans --arch=rn18 --asym=False --dataset=cifar100 --loss_fn=coteach+distill --lr_scheduler=coteach --num_gradual=140 --percent=0.8
```
or
```
python train_coteaching.py --distillation --reinit --distill_mode eigen --arch=rn18 --asym=False --dataset=cifar100 --loss_fn=coteachdistill --lr_scheduler=coteach --num_gradual=60 --percent=0.8
```
To load a checkpoint with `--load_name`, you should manually create a `checkpoint` folder and put the `xx.pth` file into it.
(`xx.pth` will be saved in the `saved` directory and its log in the `logger` directory.)
## Arguments
If `dataset`, `loss_fn`, and `lr_scheduler` are all given, you do not have to pass a config file.
If a config file is given, the `dataset`, `loss_fn`, and `lr_scheduler` arguments are ignored.
usage : python train_coteaching.py [-c] [-d] [--distillation] [--distill_mode] [
--no_wandb : whether or not to use wandb (pass --no_wandb if you do not want to use wandb)
--reinit : whether or not to re-initialize the network parameters
--load_name : checkpoint directory for the proxy network
``` -->
One of the accompanying run scripts is updated in the same commit, changing `--percent 0.2` to `--percent 0.6` (60% noise, matching the README examples above):

```
#!/bin/bash

python main.py -d 1 --asym false --percent 0.6 --lr_scheduler multistep --arch rn34 --loss_fn cce --dataset cifar10 --traintools coteaching --no_wandb --distill_mode fine-gmm --dynamic --dataseed 123 --every 10 --warmup 40
```
