
[R] CVPR 2019 Noise-Tolerant Training work: "Learning to Learn from Noisy Labeled Data"

https://arxiv.org/pdf/1812.05214.pdf

This work achieves promising results with meta-learning. Our result on Clothing1M is comparable to theirs. However, their meta-learning formulation seems extremely complex to apply in practice.

Their Algorithm 1 and implementation details (Section 4.2) expose many hyper-parameters (sketched as a config object after the list):

  1. The number of synthetic mini-batches (meta-training iterations), M;
  2. The meta-training step size, alpha;
  3. The meta-learning rate, eta;
  4. The student learning rate, beta;
  5. The exponential moving average (EMA) decay, gamma;
  6. The threshold for data filtering, tau;
  7. The number of samples with label replacement, rho.
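
To make that configuration burden concrete, here is a minimal sketch of the hyper-parameter set as a Python config object. The class name and all values are hypothetical placeholders, not the paper's actual settings (those are listed in its Section 4.2).

    from dataclasses import dataclass

    @dataclass
    class MetaTrainingConfig:
        """Knobs from Algorithm 1 / Section 4.2; values are placeholders."""
        num_meta_batches: int = 10     # M: synthetic mini-batches per update
        meta_step_size: float = 0.2    # alpha: meta-training step size
        meta_lr: float = 0.4           # eta: meta-learning rate
        student_lr: float = 2e-4       # beta: student learning rate
        ema_decay: float = 0.99        # gamma: EMA decay for the mentor
        filter_threshold: float = 0.5  # tau: data-filtering threshold
        num_replaced: int = 50         # rho: samples with label replacement

    cfg = MetaTrainingConfig()  # seven knobs to tune before training starts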

The strategies of iterative training combined with iterative data filtering/cleaning, reusing the last round's best model as the mentor, etc., make the method difficult to apply in practice (a rough outline follows).
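
That outer loop might look something like the sketch below; every function is a placeholder stub standing in for a full training stage, not the authors' code, and all values are made up.

    def pretrain(data):
        """Stub: train an initial model on the full noisy dataset."""
        return "model_round_0"

    def filter_and_relabel(data, mentor, tau, rho):
        """Stub: drop samples the mentor is unconfident about (threshold tau)
        and replace the labels of rho samples with the mentor's predictions."""
        return data

    def train_round(data, mentor):
        """Stub: one round of meta-objective training, with the mentor
        providing the consistency target; returns the round's best model."""
        return "model_next_round"

    data, tau, rho = ["noisy samples"], 0.5, 50  # placeholder data and values
    mentor = pretrain(data)                      # initial mentor
    for _ in range(3):                           # iterative training rounds
        data = filter_and_relabel(data, mentor, tau, rho)  # iterative cleaning
        mentor = train_round(data, mentor)       # best model becomes next mentor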

However, the ideas are interesting and novel:

  1. Oracle/mentor (consistency loss): For the meta-test to be reliable, the teacher/mentor model must itself be reliable and robust to real noisy examples. They therefore apply iterative training and iterative data cleaning so that the meta-test consistency loss acts as a trustworthy optimisation oracle against real noise.
  2. Unaffected by synthetic noise: Meta-training exposes the model to synthetically corrupted labels. Meta-testing then evaluates the updated model's consistency with the oracle and maximises that consistency, i.e., it trains the model to remain unaffected after seeing synthetic noise (see the sketch after this list).
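
As a concrete illustration of idea 2, here is a minimal sketch of one synthetic meta-train/meta-test step for a plain linear classifier in PyTorch. This is our reading of the mechanism, not the authors' implementation; the KL consistency term, shapes, and values are all assumptions.

    import torch
    import torch.nn.functional as F

    def meta_consistency_loss(w, x, y_synthetic_noisy, mentor_probs, alpha=0.1):
        # Meta-train: one virtual SGD step on synthetically corrupted labels.
        # create_graph=True lets gradients flow back through this update.
        train_loss = F.cross_entropy(x @ w, y_synthetic_noisy)
        (grad_w,) = torch.autograd.grad(train_loss, w, create_graph=True)
        w_fast = w - alpha * grad_w  # alpha: meta-training step size

        # Meta-test: KL divergence between the updated model's predictions
        # and the mentor's. Minimising it pushes the model to stay consistent
        # with the mentor, i.e. unaffected by the synthetic noise it just saw.
        log_p_fast = F.log_softmax(x @ w_fast, dim=1)
        return F.kl_div(log_p_fast, mentor_probs, reduction="batchmean")

    # Toy usage with random data (all shapes are arbitrary assumptions).
    w = torch.zeros(784, 10, requires_grad=True)
    x = torch.randn(32, 784)
    y_noisy = torch.randint(0, 10, (32,))
    mentor_probs = torch.softmax(torch.randn(32, 10), dim=1)
    loss = meta_consistency_loss(w, x, y_noisy, mentor_probs)
    loss.backward()  # meta-gradient w.r.t. the original weights w

The key point is that the meta-gradient differentiates through the virtual update itself, which is what lets the outer objective shape how the model reacts to noisy labels.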

Questions arise:

Is meta-learning really a good solution in practice, given so many configurations?

Or could we simplify its modelling to make it easier to use?

submitted by /u/XinshaoWang
