Why is a general/original softmax loss not preferred in FR (face recognition)?

Question

In some papers I've read that softmax loss is not preferred in FR because it does not give good inter-class and intra-class margins, but I could not understand why. Can someone explain, both mathematically and theoretically, why softmax loss is not preferred in FR?

score 1 · Answer 1 · answered Nov 28 '21 at 12:59

The disadvantages of softmax loss are stated in the paper you referenced.

"ArcFace" (https://arxiv.org/pdf/1801.07698.pdf) and "Face recognition via centralized coordinate learning" (https://arxiv.org/pdf/1801.05678.pdf)

(1) the size of the linear transformation matrix W ∈ ℝ^{d×n} increases linearly with the identities number n;

Training data can contain millions of identities, so the memory and compute cost of this final classification layer grows too large.
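To make the linear growth concrete, here is a minimal sketch that counts the entries of W for a few identity counts. The embedding size d = 512 and the values of n are assumptions chosen for illustration, not figures taken from the papers, and bias terms are ignored.

```python
def classifier_weight_count(embedding_dim: int, num_identities: int) -> int:
    """Number of entries in W (shape d x n), ignoring any bias terms."""
    return embedding_dim * num_identities


if __name__ == "__main__":
    d = 512  # assumed embedding dimension, for illustration only
    for n in (10_000, 100_000, 1_000_000):
        weights = classifier_weight_count(d, n)
        mib = weights * 4 / 2**20  # 4 bytes per float32 weight
        print(f"n = {n:>9,}: {weights:>13,} weights, about {mib:,.0f} MiB as float32")
```

Under these assumptions, each tenfold increase in n makes the last layer ten times larger, while the feature extractor in front of it stays the same size.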

(2) the learned features are separable for the closed-set classification problem but not discriminative enough for the open-set face recognition problem.

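In an open-set problem, classes that were never seen during training may appear at test time. In a closed-set problem, all test classes are already known at training time. Face recognition is an open-set problem, so the learned features must be discriminative enough to compare faces of unseen identities, not merely separable among the training identities.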
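The open-set setting can be sketched as follows: instead of picking one of the n training classes, the system compares two feature vectors and decides whether they belong to the same person. This is only an illustrative sketch. The function names, the 3-dimensional embeddings, and the threshold of 0.5 are all hypothetical; in practice the embeddings would come from a trained network and the threshold would be tuned on validation data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_identity(emb_a, emb_b, threshold=0.5):
    """Open-set verification: accept the pair if the embeddings are close enough.

    The threshold is an assumed value for this example.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


# Made-up embeddings of people who were NOT in the training set.
enrolled = [0.9, 0.1, 0.2]
probe_same = [0.8, 0.2, 0.1]    # second photo of the enrolled person
probe_other = [-0.1, 0.9, 0.3]  # photo of a different person

if __name__ == "__main__":
    print(same_identity(enrolled, probe_same))   # True
    print(same_identity(enrolled, probe_other))  # False
```

A decision like this only works if embeddings of the same person lie close together and embeddings of different people lie far apart, even for identities never seen in training. Features that are merely separable by the softmax classifier do not guarantee that.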
