
[R] YOLACT: Real-time Instance Segmentation


tl;dr: Instance Segmentation slow, YOLACT make fast (29.8 COCO mAP, 33.5 Titan Xp fps).

Hi all, my paper was recently accepted to ICCV 2019 as an oral, so I thought I’d post it here. (Note: the fps numbers were rebenchmarked for ICCV and I haven’t updated them elsewhere.)

Today, object detection has several methods that do well (e.g., Faster R-CNN+++, RetinaNet), and several that do well enough but are also fast (e.g., YOLOv2-3, SSD). On the other hand, the same isn’t true for instance segmentation. We have good methods (e.g., Mask R-CNN and its derivatives, Retina-Mask), but no fast methods that do well enough on a complex dataset like COCO.

YOLACT changes this. We obtain 29.8 mAP (30.1 after a stupid bug fix, but the paper’s out now >.>) on COCO at 33.5 fps on a single Titan Xp, making YOLACT the best fast instance segmentation method out at the moment. And it’s simple: predict a set of k basis masks (prototypes) over the whole image and, in parallel, predict a set of k linear combination coefficients (mask coefficients) for each detection. Then, to generate the mask for a detection, just multiply the mask coefficients into the prototypes and add (which can be implemented as one matrix multiplication per image). This whole process adds only ~5-6 ms on top of any existing object detector.
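A minimal NumPy sketch of that assembly step, to make the "one matrix multiplication" concrete. The function name, the array shapes, and the layout (prototypes as `(h, w, k)`, coefficients as `(n, k)`) are assumptions made for illustration. The sigmoid at the end follows the formulation in the paper rather than anything stated in this post, and cropping or thresholding the masks is left out.

```python
import numpy as np

def assemble_masks(prototypes, coefficients):
    """Linearly combine k prototype masks into one mask per detection.

    prototypes:   (h, w, k) array, the k basis masks for the whole image
    coefficients: (n, k) array, one row of k mask coefficients per detection
    returns:      (h, w, n) array, one mask per detection with values in (0, 1)
    """
    h, w, k = prototypes.shape
    # Flatten the spatial dims so all n linear combinations are a single matmul.
    logits = prototypes.reshape(h * w, k) @ coefficients.T  # shape (h*w, n)
    # Squash to (0, 1) with a sigmoid (taken from the paper, not from this post).
    masks = 1.0 / (1.0 + np.exp(-logits))
    return masks.reshape(h, w, -1)

# Illustrative usage with random data: 32 prototypes, 5 detections.
rng = np.random.default_rng(0)
protos = rng.standard_normal((138, 138, 32))
coeffs = rng.standard_normal((5, 32))
masks = assemble_masks(protos, coeffs)  # shape (138, 138, 5)
```

Because the prototypes are shared by every detection in the image, the per-detection cost is just one row of coefficients, which is why the mask branch stays cheap.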

I also came up with “Fast NMS”, a close approximation to traditional per-class NMS that’s 12ms faster.
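A rough NumPy sketch of the Fast NMS idea as described in the paper, for a single class: sort by score, compute the full pairwise IoU matrix once, and drop any box whose IoU with some higher-scoring box exceeds the threshold. The helper names, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions for illustration; this is not the released implementation, and running it per class in a batch is left out.

```python
import numpy as np

def box_iou(a, b):
    """Pairwise IoU between boxes a (n, 4) and b (m, 4) in (x1, y1, x2, y2) form.

    Assumes every box has positive area.
    """
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    top_left = np.maximum(a[:, None, :2], b[None, :, :2])
    bottom_right = np.minimum(a[:, None, 2:], b[None, :, 2:])
    wh = np.clip(bottom_right - top_left, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def fast_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first (single class)."""
    order = np.argsort(-scores)            # sort by descending score
    iou = box_iou(boxes[order], boxes[order])
    # Keep only entries (i, j) with i < j: IoU of box j with a higher-scoring box i.
    iou = np.triu(iou, k=1)
    # For each box, its largest overlap with ANY higher-scoring box,
    # including ones that would themselves be suppressed.
    max_iou = iou.max(axis=0)
    return order[max_iou <= iou_threshold]

boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
keep = fast_nms(boxes, scores)  # second box overlaps the first heavily and is dropped
```

The approximation comes from letting already-suppressed boxes suppress others: in a chain where A suppresses B and B overlaps C but A does not, traditional NMS keeps C while this version removes it. In exchange there is no sequential loop, so the whole thing is a few array operations.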

Feel free to AMA.

submitted by /u/dbolya