I'll use the example of classifying pumpkins. Take the Cinderella pumpkin versus the gourd pumpkin.
Intuitively, it may seem wise to classify these images as two different outputs, such as cinderella-pumpkin and gourd-pumpkin, because of how different they look.
My question is: if I take a training set of images that includes both Cinderella pumpkins and gourd pumpkins and classify both of them under the single category pumpkin, will the network perform worse than if I separated them into two categories? What is the threshold at which two objects are so different that they should be put into separate categories?
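A minimal sketch of how the two setups could be compared on equal footing, assuming the fine-grained predictions are collapsed to the merged category before scoring. The label names, the ground truth, and the predictions below are all made up for illustration; no real model or dataset is involved.

```python
# Hypothetical label names, used only for illustration.
FINE_TO_COARSE = {
    "cinderella-pumpkin": "pumpkin",
    "gourd-pumpkin": "pumpkin",
    "cat": "cat",
}


def to_coarse(labels):
    """Map each fine-grained label to its merged category."""
    return [FINE_TO_COARSE[label] for label in labels]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up ground truth and made-up predictions from a fine-grained model.
y_true_fine = ["cinderella-pumpkin", "gourd-pumpkin", "gourd-pumpkin", "cat"]
y_pred_fine = ["gourd-pumpkin", "gourd-pumpkin", "cinderella-pumpkin", "cat"]

# Score the same predictions at both granularities.
fine_acc = accuracy(y_true_fine, y_pred_fine)
coarse_acc = accuracy(to_coarse(y_true_fine), to_coarse(y_pred_fine))

print(f"fine-grained accuracy: {fine_acc}")
print(f"merged-category accuracy: {coarse_acc}")
```

Confusions between the two pumpkin types count as errors at the fine-grained level but disappear once the labels are merged, so the two setups are only comparable when both are scored on the merged labels.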
Or, to take a more extreme example for the sake of clarity: if I took pictures of cats and pictures of pineapples and classified them under the same category, how would the network's ability to classify each object be affected, compared to creating a cat output and a