
Which is better for accuracy, or are they the same? Of course, if you use categorical_crossentropy you use one-hot encoding, and if you use sparse_categorical_crossentropy you encode as normal integers. Additionally, when is one better than the other?


2 Answers


Use sparse categorical crossentropy when your classes are mutually exclusive (i.e. each sample belongs to exactly one class), and categorical crossentropy when one sample can have multiple classes or the labels are soft probabilities (like [0.5, 0.3, 0.2]).
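For instance, here is a minimal sketch (using TensorFlow's Keras API; the probability values are made up) of a soft-probability target, which only categorical crossentropy can express, since sparse_categorical_crossentropy accepts a single integer class id per sample:

    import numpy as np
    import tensorflow as tf

    # Made-up softmax outputs for a single sample over 3 classes.
    probs = np.array([[0.6, 0.3, 0.1]], dtype="float32")

    # Soft-probability target like [0.5, 0.3, 0.2]: a full distribution,
    # so categorical_crossentropy is required here.
    soft_target = np.array([[0.5, 0.3, 0.2]], dtype="float32")

    cce = tf.keras.losses.CategoricalCrossentropy()
    print(cce(soft_target, probs).numpy())  # cross-entropy against the soft target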

The formula for categorical crossentropy (with $S$ the set of samples, $C$ the set of classes, and $s \in c$ meaning sample $s$ belongs to class $c$) is:

$$ -\frac{1}{|S|} \sum_{s \in S} \sum_{c \in C} 1_{s \in c} \log p(s \in c) $$

For the case when classes are exclusive, you don't need to sum over them: for each sample, the only non-zero term is $-\log p(s \in c)$ for the true class $c$.
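A small NumPy sketch (with made-up probabilities) of that collapse:

    import numpy as np

    # Made-up predicted probabilities for one sample over 3 classes.
    p = np.array([0.7, 0.2, 0.1])

    # One-hot target: the sample belongs to class 0.
    y_onehot = np.array([1.0, 0.0, 0.0])

    # Full categorical crossentropy: sum over all classes.
    full_sum = -np.sum(y_onehot * np.log(p))

    # Sparse shortcut: just the log-probability of the true class.
    true_class = 0
    shortcut = -np.log(p[true_class])

    print(full_sum, shortcut)  # identical: ~0.3567 in both cases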

This saves time and memory. Consider the case of 10,000 mutually exclusive classes: just one log per sample instead of a sum of 10,000 terms, and just one integer label instead of 10,000 floats.

The formula is the same in both cases, so there should be no impact on accuracy.
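You can check this numerically; a minimal sketch with made-up predictions, showing both Keras losses return the same value on equivalent labels:

    import numpy as np
    import tensorflow as tf

    # Made-up softmax outputs for two samples over 3 classes.
    probs = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1]], dtype="float32")

    int_labels = np.array([0, 1])                                  # integer labels
    onehot_labels = tf.keras.utils.to_categorical(int_labels, 3)   # one-hot labels

    cce = tf.keras.losses.CategoricalCrossentropy()
    scce = tf.keras.losses.SparseCategoricalCrossentropy()

    print(cce(onehot_labels, probs).numpy())   # ~0.2899
    print(scce(int_labels, probs).numpy())     # same value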


The answer, in a nutshell

If your targets are one-hot encoded, use categorical_crossentropy. Examples of one-hot encodings:

[1,0,0]
[0,1,0] 
[0,0,1]

But if your targets are integers, use sparse_categorical_crossentropy. Examples of integer encodings (for the sake of completeness):

1
2
3
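Putting it together, a minimal Keras sketch (the model shape and layer sizes here are hypothetical) where only the loss string changes with the label encoding:

    import tensorflow as tf

    # Hypothetical model: 20 input features, 3 output classes.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(3, activation="softmax"),
    ])

    # If y_train is one-hot encoded ([1,0,0], [0,1,0], ...):
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])

    # If y_train holds integer class ids (0, 1, 2, ...):
    # model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
    #               metrics=["accuracy"])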