[Research] What happens when CNN devised for 224x224x3 images is trained on high resolution imagery ?
I’m training a CNN on high resolution satellite imagery. Because of hardware constraints, I’m using EfficientNetB0
(EB0
) to predict classes from 800x800x3 tiles. This can be done thanks to the 2D GlobalAveragePooling
layer at the end of the model which compresses the 25x25x1280 feature map (7x7x1280 when working with the original 224×224 images) into a 1D 1280 vector to be fed to dense layers and such. It works surprisingly well. Even more, I get better performance with EB0
used on 800x800x3 tiles than with EB5
applied on 456x456x3 tiles (the resolution for which they were designed).
How is this possible?
submitted by /u/Many_Consideration
[link] [comments]