[D] Precise detection of a large number of keypoints
Keypoint detection has been modelled successfully with CNNs that output a heatmap tensor of size H x W x K, where K is the number of keypoints you want to detect and H and W are the height and width of each heatmap. If you want precise detections, H and W should ideally match the size of the input image.
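For anyone unfamiliar with the setup, here is a minimal NumPy sketch of how keypoint locations are typically read off such a tensor: one argmax per channel. The function name `decode_heatmaps` and the toy 8 x 8 x 3 tensor are made up for illustration, and real pipelines often add sub-pixel refinement on top of this.

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Return a (K, 2) array of (row, col) peak locations from an (H, W, K) tensor."""
    h, w, k = heatmaps.shape
    # Flatten the spatial dimensions so each column holds one keypoint's heatmap.
    flat = heatmaps.reshape(h * w, k)
    # Index of the strongest response per keypoint channel.
    idx = flat.argmax(axis=0)
    # Convert flat indices back to 2D pixel coordinates.
    rows, cols = np.unravel_index(idx, (h, w))
    return np.stack([rows, cols], axis=1)

# Toy example (hypothetical data): 3 keypoints on an 8 x 8 grid.
heatmaps = np.zeros((8, 8, 3), dtype=np.float32)
heatmaps[2, 5, 0] = 1.0
heatmaps[7, 0, 1] = 1.0
heatmaps[4, 4, 2] = 1.0
peaks = decode_heatmaps(heatmaps)
```

The localisation precision here is bounded by the heatmap resolution, which is why H and W matter so much.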
I want to detect K >= 300 keypoints using an input image of size 512 x 512. Due to memory limitations I can't use the naive approach above, which upscales the heatmaps to the original input size.
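To put a rough number on the memory issue, here is a back-of-the-envelope calculation for the output tensor alone. It assumes float32 values (4 bytes each) and a batch size of 16, both of which are my assumptions. It ignores gradients and intermediate activations, so the actual training footprint is larger.

```python
def heatmap_bytes(h, w, k, bytes_per_value=4, batch_size=1):
    """Size in bytes of a dense (batch, H, W, K) heatmap tensor."""
    return h * w * k * bytes_per_value * batch_size

# One 512 x 512 image with K = 300 keypoints, float32 (assumed dtype).
size = heatmap_bytes(512, 512, 300)
mib = size / 2**20

# Same tensor for an assumed batch of 16 images.
batch_mib = heatmap_bytes(512, 512, 300, batch_size=16) / 2**20
```

That is 300 MiB per image and 4800 MiB for a batch of 16, just for the final output.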
Is anyone aware of any research that addresses this specific issue?