Is it proven that all 15M images were manually classified correctly and there are no mistakes or randomly selected responses collected?

ivan866

2 Answers

It has actually been shown that ImageNet and many other famous datasets contain lots of errors, including bad labels. See this nice post on the topic:

https://medium.com/@amiralush/large-image-datasets-today-are-a-mess-e3ea4c9e8d22

Iyar Lin

In addition to what Iyar Lin has said, I'll mention that there is no such thing as a large organic image dataset with no errors. Even if we were to get many intelligent humans to analyze each image, many of the images would draw too much disagreement to say what the "correct" label is.
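One rough way to put a number on that disagreement, assuming each image has been labeled by several annotators, is the share of annotators who picked the most common label. The sketch below is only an illustration: the function name, image IDs, and votes are all made up, and ties are broken by whichever label was seen first.

```python
from collections import Counter


def majority_agreement(votes):
    """Return (most common label, fraction of annotators who chose it)."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)


# Hypothetical annotator votes, invented for illustration only.
votes_by_image = {
    "img_a": ["pizza", "pizza", "pizza", "pizza", "pizza"],
    "img_b": ["pizza", "not pizza", "pizza", "not pizza", "not pizza"],
}

for image_id, votes in votes_by_image.items():
    label, agreement = majority_agreement(votes)
    print(image_id, label, round(agreement, 2))
```

An image like `img_b` still ends up with a single label in the dataset, but with only three of five annotators behind it, calling that label "correct" is a stretch.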

Another person and I wrote a report analyzing a small part of the OpenImages dataset. Here is a bit of what we learned. (Full disclosure: this includes some quotes and paraphrasing from that report, but I cannot cite it as it is not public.)

[Image: pizza?]

Look at this image. It appears to feature prosciutto, arugula, and a sprinkling of cheese on top of a toasted base. Is it pizza? It is clear what is contained in the photo, but your answer depends on how you define pizza. You can ask several people, and you will get answers ranging from "probably pizza" to "definitely not pizza" and everywhere in between. No matter whether you label this as a pizza or not, there will be people who say you're wrong.

[Image: box with photo of pizza]

Now consider this. It is a pizza box with a picture of a pizza on it. Is it pizza? Again, it's clear what is in the photo, but there's no right answer to what counts as a pizza, since it depends on context. It is clearly a picture of a picture of pizza, but you sometimes might want to count it as pizza, and sometimes might not.

[Image: unclear food]

Lastly, consider this dish. Unlike the previous two photos, it is unclear what's in the photo. There are definitely eggs on top, but underneath there are things that could be dough, cheese, and pizza sauce, or could be scrambled eggs and ketchup. Assuming for the sake of argument that pizza can have fried eggs on top, whether or not this counts as pizza cannot be determined from the photo.

To conclude: while it may be possible to make a "100% correct" dataset by discarding ambiguous samples, it would no longer represent what's in the real world. In the real world, things are ambiguous, and there are "errors" that will always exist due to the philosophical ambiguity of our environment.
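To make the trade-off concrete, here is a minimal sketch of that kind of filtering, assuming several annotator votes per image and an arbitrary agreement threshold. The function name, the 0.8 threshold, the image IDs, and the votes are all hypothetical, not taken from any real dataset or from the report.

```python
from collections import Counter


def filter_unambiguous(votes_by_image, min_agreement=0.8):
    """Keep only images whose most common label reaches min_agreement."""
    kept = {}
    for image_id, votes in votes_by_image.items():
        label, n = Counter(votes).most_common(1)[0]
        if n / len(votes) >= min_agreement:
            kept[image_id] = label
    return kept


# Hypothetical votes, invented for illustration only.
votes_by_image = {
    "clear_pizza": ["pizza"] * 5,
    "flatbread": ["pizza", "not pizza", "pizza", "not pizza", "not pizza"],
    "pizza_box": ["pizza", "pizza", "not pizza", "not pizza", "pizza"],
}

kept = filter_unambiguous(votes_by_image)
print(kept)
print(f"kept {len(kept)} of {len(votes_by_image)} images")
```

The surviving labels look clean, but the flatbread and the pizza box are exactly the cases that got dropped, so a model trained or evaluated on the filtered set never sees them.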