[D] Has anyone used context to improve object detection and image classification?
We do a lot of image classification using the Tensorflow Object Detection API.
Our images often appear in groups, e.g. a cluster of fish swimming by a camera. Often our model will recognize some of the fish, but not all of them. This is obviously a mistake a human would never make.
I am researching whether there are any examples of object detectors/image classifiers using context to improve results. E.g. knowing that there are three fish swimming through the frame would increase the model's propensity to find a fourth nearby.
Another way to potentially attack this problem would be to first identify clusters of objects, then re-examine only each cluster to count the objects within it. The complication is that some of the objects we are examining appear in clusters, while others do not.
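To make the two-stage idea concrete, here is a minimal sketch (not from the original post, and independent of the TensorFlow Object Detection API): group first-pass detection boxes by the proximity of their centers, then derive a padded crop region per cluster that a second, focused detection pass could be run on. The `eps` distance threshold and `[x1, y1, x2, y2]` box format are assumptions.

```python
import numpy as np

def cluster_detections(boxes, eps=50.0):
    """Group boxes [x1, y1, x2, y2] whose centers are within eps of each
    other (transitively), via a simple connected-components flood fill."""
    boxes = np.asarray(boxes, dtype=float)
    centers = np.stack([(boxes[:, 0] + boxes[:, 2]) / 2,
                        (boxes[:, 1] + boxes[:, 3]) / 2], axis=1)
    labels = -np.ones(len(centers), dtype=int)
    current = 0
    for i in range(len(centers)):
        if labels[i] >= 0:
            continue  # already assigned to a cluster
        stack = [i]
        labels[i] = current
        while stack:
            j = stack.pop()
            dists = np.linalg.norm(centers - centers[j], axis=1)
            for k in np.flatnonzero((dists < eps) & (labels < 0)):
                labels[k] = current
                stack.append(k)
        current += 1
    return labels

def cluster_crop_regions(boxes, labels, pad=20.0):
    """Bounding box around each cluster, padded so a second detection
    pass over the crop still sees some surrounding context."""
    boxes = np.asarray(boxes, dtype=float)
    regions = {}
    for lab in np.unique(labels):
        members = boxes[labels == lab]
        regions[int(lab)] = [members[:, 0].min() - pad,
                             members[:, 1].min() - pad,
                             members[:, 2].max() + pad,
                             members[:, 3].max() + pad]
    return regions

# Toy first-pass output: three fish close together, one far away.
boxes = [[10, 10, 40, 40], [45, 12, 75, 42],
         [30, 50, 60, 80], [300, 300, 330, 330]]
labels = cluster_detections(boxes)
regions = cluster_crop_regions(boxes, labels)
print(len(regions))  # two regions: the tight group of three and the lone fish
```

The second stage would then crop each region out of the original image and run the detector again at higher effective resolution, which often recovers objects missed in the full-frame pass.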
So to summarize, my two questions are:
- Is there research on improving an object detector using contextual features within an image, and how would you suggest I go about it?
- In situations where objects appear in clusters, would you recommend detecting each cluster as a single region and then processing that region separately to count the objects within it? Are there examples / is there research on this?
Thanks in advance – obviously doing my own research as well, but keen to hear if the community has any thoughts/examples!