Learn About Our Meetup

5000+ Members



Join our meetup, learn, connect, share, and get to know your Toronto AI community. 



Browse the latest deep learning, AI, and machine learning job postings from Indeed for the GTA.



If you are looking to sponsor space, be a speaker, or volunteer, feel free to give us a shout.

[D] I’m trying to implement ‘Born Again Neural Networks’ by T. Furlanello.

Hi, I’m trying to implement ‘Born Again Neural Networks’ (BAN) by T. Furlanello, and I have some questions. Can anybody help with this, please?

If you have read papers on knowledge distillation, you will know that some papers released before BAN used a so-called temperature. In those papers, the logits were divided by the temperature (usually a positive integer) and the result was then passed through a softmax.
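As a minimal sketch of the temperature idea described above (the function name is mine, not from any paper):

```python
import numpy as np

def softmax_with_temperature(logits, T=1.0):
    # Divide the logits by temperature T before applying softmax.
    # T = 1 gives the standard softmax; T > 1 softens the distribution,
    # spreading probability mass onto the non-argmax classes.
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()
```

With a higher T, the output distribution becomes more uniform, which is what lets the student see the teacher’s relative confidences rather than a near-one-hot vector.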

In the BAN paper, however, the author doesn’t mention temperature. So I didn’t soften the logits (in other words, I set the temperature to 1), but I found that it failed: there was no dramatic difference between the original network and the distilled network. My guess is that if I set the temperature to 1 and the network overfits, the output distribution doesn’t provide meaningful ‘dark knowledge’.

For example, there would be no meaningful difference between [1, 0, 0, 0, 0] and [0.999, 0.000…, 0.000…, 0.000…, 0.000…].
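The effect described above can be illustrated numerically. Assuming a made-up set of logits from an overfit network (these values are hypothetical, chosen only for illustration), the T = 1 softmax is essentially one-hot, while a higher temperature exposes the relative probabilities of the wrong classes:

```python
import numpy as np

def softened(logits, T):
    # Temperature-scaled softmax (standard softmax when T = 1).
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits from an overfit network: the true class dominates.
logits = [12.0, 4.0, 3.0, -1.0, -2.0]

p1 = softened(logits, T=1.0)  # nearly one-hot: almost all mass on class 0
p4 = softened(logits, T=4.0)  # softened: the ranking among the wrong
                              # classes ("dark knowledge") becomes visible
```

Here `p1` carries almost no information beyond the hard label, whereas `p4` still shows that class 1 is considered more plausible than classes 2–4.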

So, do I have to apply temperature even though the author doesn’t mention it in the paper? I’m wondering whether I can succeed without applying temperature. Thank you for reading.

submitted by /u/crackitr

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, VR, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.