[P] [D] StripNet: Towards Topology Consistent Strip Structure Segmentation

I am trying to understand the paper that introduces StripNet. However, the authors do not explain the steps very clearly, and I’m having a lot of trouble following what they do.

1) Is ROIAlign just Mask R-CNN? I do not really understand why they use it or what its purpose is.

2) After they partition the retinal layer region into 16-pixel-wide partitions (why 16?), how do they classify the layers in each one?
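
To make question 2 concrete, this is how I currently picture the partitioning step. It is only my own sketch and not the authors' code: the function name `split_into_strips`, the 2D `(height, width)` array layout, and the choice to cut along the width axis are all my assumptions. The only thing taken from the paper is the 16-pixel width.

```python
import numpy as np

STRIP_WIDTH = 16  # partition width in pixels (the value mentioned in the paper)


def split_into_strips(region, strip_width=STRIP_WIDTH):
    """Cut a (height, width) array into consecutive vertical strips.

    Hypothetical helper: assumes the width is a multiple of strip_width.
    Returns an array of shape (num_strips, height, strip_width).
    """
    height, width = region.shape
    if width % strip_width != 0:
        raise ValueError("width must be a multiple of strip_width")
    num_strips = width // strip_width
    # Split the width axis into (num_strips, strip_width), then move the
    # strip index to the front so strips[i] is the i-th partition.
    return region.reshape(height, num_strips, strip_width).transpose(1, 0, 2)


# Dummy 64 x 128 "retinal layer region" standing in for real data.
region = np.arange(64 * 128).reshape(64, 128)
strips = split_into_strips(region)
print(strips.shape)
```

If this picture is right, each `strips[i]` would then be classified on its own, and that per-strip classification is the part I do not follow.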

Could someone who understands the paper help me? A simple explanation of the pipeline would be very helpful, as I have a lot of questions.

