I am training a neural network using batches of soft labels, e.g.
y = [[0.00, 0.25, 0.25, 0.50], ... [0.75, 0.00, 0.20, 0.05]]
However, unlike with one-hot labels, if the softmax outputs a distribution ŷ equal to y (a perfect prediction), as in
y = ŷ = [0.00, 0.25, 0.25, 0.50]
the cross-entropy function is not 0:
loss = -sum(y * log(ŷ)) = 1.0397
although it is true that no other ŷ can achieve a lower value for this y.
Moreover, the closer y is to uniform (i.e., the higher its entropy), the larger the minimum possible loss:
y = ŷ = [0.25, 0.25, 0.25, 0.25]
loss = -sum(y * log(ŷ)) = 1.3862
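For reference, here is a quick NumPy sketch checking these numbers (the cross_entropy helper and the eps smoothing are just for illustration, not from the original training code). It shows that the cross-entropy of y with itself equals the entropy H(y), which is the lowest value any ŷ can reach for that y:

import numpy as np

def cross_entropy(y, y_hat, eps=1e-12):
    """Cross-entropy -sum(y * log(y_hat)); eps avoids log(0) for zero targets."""
    return -np.sum(y * np.log(y_hat + eps))

y1 = np.array([0.00, 0.25, 0.25, 0.50])
y2 = np.array([0.25, 0.25, 0.25, 0.25])

print(cross_entropy(y1, y1))  # ~1.0397 = H(y1)
print(cross_entropy(y2, y2))  # ~1.3863 = H(y2); the uniform label has the highest entropy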
So my question is: does this nonzero lower bound on the loss introduce a bias when training/testing a neural network? Since the minimum achievable cost is higher for high-entropy (near-uniform) soft labels than for lower-entropy ones (down to one-hot), might the network adjust its weights and biases in a way that prioritizes reducing the loss on the high-entropy soft labels, to the detriment of the lower-entropy soft and one-hot labels?