
# [D] Minimum cost is not zero when calculating cross-entropy on soft labels

I am training a neural network using batches of soft labels, e.g.

``y = [[0.00, 0.25, 0.25, 0.50], ... [0.75, 0.00, 0.20, 0.05]] ``

However, unlike with one-hot labels, if the softmax activation outputs a vector ŷ equal to y (a perfect prediction), as in

``y = ŷ = [0.00, 0.25, 0.25, 0.50] ``

the cross-entropy is not 0:

``loss = -sum(y * log(ŷ)) = 1.0397 ``

although it is true that no other choice of ŷ can reach a lower value for this y.
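
For concreteness, here is a quick NumPy check of that number (treating the 0 * log(0) term as 0, the usual convention for entropy); the variable names are just for illustration:

```python
import numpy as np

y = np.array([0.00, 0.25, 0.25, 0.50])
y_hat = y.copy()  # perfect prediction: the output matches the soft label exactly

# cross-entropy -sum(y * log(ŷ)), dropping the 0 * log(0) term
mask = y > 0
loss = -np.sum(y[mask] * np.log(y_hat[mask]))
print(loss)  # ≈ 1.0397, which is exactly the entropy H(y) of the label itself
```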

In fact, the minimum possible loss is exactly the entropy of y, so the softer (higher-entropy) the label, the larger that floor becomes:

``y = ŷ = [0.25, 0.25, 0.25, 0.25] loss = -sum(y * log(ŷ)) = 1.3862 ``

So my question is: does this floor on the minimum possible loss constitute a bias when training/testing a neural network? Since the network incurs a higher minimum cost on higher-entropy soft labels than on lower-entropy (up to one-hot) labels, might it adjust its weights and biases in a way that prioritizes reducing the loss on the higher-entropy soft labels, to the detriment of the lower-entropy soft and one-hot labels?
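
For reference, the quantity in question splits into the label entropy plus a KL term, cross-entropy = H(y) + KL(y‖ŷ). A minimal NumPy sketch of that decomposition (the ŷ below is a made-up imperfect prediction, purely for illustration):

```python
import numpy as np

def cross_entropy(y, y_hat):
    """-sum(y * log(ŷ)), with 0 * log(0) treated as 0."""
    mask = y > 0
    return -np.sum(y[mask] * np.log(y_hat[mask]))

def entropy(y):
    mask = y > 0
    return -np.sum(y[mask] * np.log(y[mask]))

def kl(y, y_hat):
    mask = y > 0
    return np.sum(y[mask] * np.log(y[mask] / y_hat[mask]))

y     = np.array([0.00, 0.25, 0.25, 0.50])
y_hat = np.array([0.10, 0.20, 0.30, 0.40])  # hypothetical imperfect prediction

print(cross_entropy(y, y_hat))    # ≈ 1.1615
print(entropy(y) + kl(y, y_hat))  # same value: H(y) + KL(y || ŷ)
print(entropy(y))                 # ≈ 1.0397, the floor from the example above
```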

submitted by /u/vratiner