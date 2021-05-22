



This is the code for the paper:

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
Lu Jiang, Di Huang, Mason Liu, Weilong Yang
Presented at ICML 2020, Slides

Please note that this is not an officially supported Google product. This is a reproduction, not the original code.

If you find this code useful for your research, please cite the paper:

@inproceedings{jiang2020beyond,
  title = {Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels},
  author = {Jiang, Lu and Huang, Di and Liu, Mason and Yang, Weilong},
  booktitle = {ICML},
  year = {2020}
}

Introduction

Given a noisy dataset with an unknown noise level, our goal is to train a robust model that generalizes well to clean test data. We present a simple and effective method called MentorMix. Existing methods that work well on synthetic noise may not work well on real noisy labels, but we show that our method overcomes both synthetic and real noisy labels.

We also release the first benchmark of controlled real-world label noise from the web. Check out the dataset at this link. If you have questions about the dataset or the method, please leave them on the Issues tab.

Algorithm

MentorMix is inspired by MentorNet (for curriculum learning) and Mixup (for vicinal risk minimization).

MentorMix consists of four steps: weight, sample, mixup, and weight again. The animation below illustrates these steps. MentorNet is used to compute the weight of each sample. In the simplest case, MentorNet can be replaced by a simple thresholding function that compares the loss of the sample with a moving average of the loss p-percentile. See loss_thresholding_function in utils.py. In addition, the second weighting was found to be useful at high noise levels.
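The thresholding idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code in utils.py: the function names percentile and threshold_weights, the decay parameter, and the choice to give weight 1 to samples whose loss is at or below the moving average are all hypothetical.

```python
def percentile(values, p):
    """Return the p-th percentile of values (p in [0, 1]), linearly interpolated."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = p * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def threshold_weights(losses, loss_moving_avg, p=0.5, decay=0.9):
    """Assumed rule: keep samples whose loss is at or below a moving average.

    The moving average tracks the p-percentile of the per-sample losses
    seen in each mini-batch (exponential moving average, hypothetical decay).
    """
    batch_percentile = percentile(losses, p)
    new_avg = decay * loss_moving_avg + (1.0 - decay) * batch_percentile
    weights = [1.0 if loss <= new_avg else 0.0 for loss in losses]
    return weights, new_avg


# Two low-loss samples and two high-loss (likely noisy) samples.
weights, avg = threshold_weights([0.1, 0.2, 3.0, 4.0], loss_moving_avg=1.0)
print(weights)  # [1.0, 1.0, 0.0, 0.0]
```

A smaller p lowers the threshold, so fewer samples keep a non-zero weight, which is the behaviour one would want as the suspected noise level rises.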

MentorMix is evaluated on five datasets, including CIFAR-10/100 with synthetic label noise and WebVision 1.0. It achieves the best published result on the full WebVision dataset, significantly improving on the previous best method by about 3% in top-1 accuracy (precision@1) on the ImageNet ILSVRC12 validation set.

Setup

All code was developed and tested on an Nvidia V100 (16GB) in the following environment:

Ubuntu 18.04
Python 2.7.15
TensorFlow 1.15.0
numpy 1.13.3

Next, download the CIFAR dataset. Place it in a directory named data, located alongside the code directory.

Running MentorMix on CIFAR

CIFAR-100 with 40% noise:

nohup python code/cifar_train_mentormix.py \
  --batch_size=128 \
  --dataset_name=cifar100 \
  --trained_mentornet_dir=mentornet_models/mentornet_pd \
  --loss_p_percentile=0.5 \
  --burn_in_epoch=10 \
  --data_dir=data/cifar100/0.4 \
  --train_log_dir=cifar100_models/resnet32/0.4/mentormix_p05_a8/train \
  --studentnet=resnet32 \
  --max_number_of_steps=20000 \
  --device_id=0 \
  --num_epochs_per_decay=30 \
  --mixup_alpha=8.0 > train_mentormix_p05_a8.

The training script has two very important command line flags for setting hyperparameters.

--mixup_alpha: the hyperparameter of the Beta distribution used for Mixup.
--loss_p_percentile: the percentile p used to compute the moving average of the loss.
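To make the role of the Beta-distribution hyperparameter concrete, here is a minimal sketch of standard Mixup on a single pair of examples, using only the Python standard library. It shows plain Mixup, not the weighted variant used by MentorMix, and the function name mixup_pair and its signature are hypothetical.

```python
import random


def mixup_pair(x_a, y_a, x_b, y_b, alpha, rng=random):
    """Mix two (features, one-hot label) pairs with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs and of the labels with the same lam.
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


rng = random.Random(0)  # fixed seed so the sketch is repeatable
x, y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0],
                       [0.0, 1.0], [0.0, 1.0],
                       alpha=8.0, rng=rng)
print(lam, x, y)
```

With a value above 1, such as the 8.0 used in the command above, Beta(alpha, alpha) concentrates around 0.5, so the two examples are blended strongly. Values below 1 push the mixing coefficient toward 0 or 1, which leaves most mixed examples close to one of the originals.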

To evaluate, run the following:

EXPDIR="cifar100_models/resnet32/0.4/mentormix_p05_a8"
nohup python code/cifar_eval.py \
  --dataset_name=cifar100 \
  --data_dir=data/cifar100/val/ \
  --checkpoint_dir="${EXPDIR}/train" \
  --eval_dir="${EXPDIR}/eval_val" \
  --studentnet=resnet32 \
  --device_id=1 > $(basename "$EXPDIR" .txt) &

Performance on CIFAR10 and CIFAR100

This code reimplements, on a single GPU and using third-party libraries, what was originally asynchronous multi-GPU training. Because it uses fewer training steps, the numbers may differ slightly from those reported in the paper.

CIFAR-100 ResNet-32

noise_fraction  mentormix
0.2             0.778
0.4             0.704
0.6             0.660
0.8             0.392

CIFAR-10 ResNet-32

noise_fraction  mentormix
0.2             0.962
0.4             0.944
0.6             0.890
0.8             0.820

Practical recommendations

Based on the findings of the paper, we have the following practical recommendations for training deep neural networks with noisy data.

1. An easy way to handle noisy labels is to fine-tune a pre-trained model. The better the pre-trained model, the better it is likely to generalize on downstream noisy training tasks.
2. Early stopping may not be effective on real-world label noise from the web.
3. Real-world label noise from the web looks less harmful, but it is more difficult to deal with.
4. Methods that work well on synthetic noise may not work as well on real-world noisy labels from the web.
5. The proposed MentorMix overcomes both synthetic and real-world noisy labels.

