
[D] Global Context block spatial resolution


The paper in question is here: https://arxiv.org/pdf/1904.11492.pdf

The authors observe that the attention maps computed for different query positions are nearly identical, so a lot of computation can be saved by replacing the per-query attention with a single query-independent attention map (I sketch my reading of this below). That sounds good, but the following diagram of their architecture confuses me:

[Figure 4(d) from the paper]
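To make sure I am reading the context-modeling step correctly: as I understand it, a single 1x1 conv produces one attention logit per spatial position, a softmax is taken over all H*W positions, and the features are pooled with those shared weights. This is my own PyTorch sketch (class and variable names are mine, not the authors' code), so corrections welcome:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class QueryIndependentContext(nn.Module):
    """Context-modeling step as I read Fig. 4(d): one attention map
    shared by every query position, instead of an (HW x HW) pairwise
    attention matrix as in the original non-local block."""

    def __init__(self, channels):
        super().__init__()
        # 1x1 conv -> one attention logit per spatial position
        self.attn_logits = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Softmax over all H*W positions: a single global attention map
        weights = F.softmax(self.attn_logits(x).view(n, h * w), dim=1)
        # Attention-weighted pooling over positions:
        # (N, C, HW) @ (N, HW, 1) -> (N, C, 1) -> (N, C, 1, 1)
        context = torch.bmm(x.view(n, c, h * w), weights.unsqueeze(2))
        return context.view(n, c, 1, 1)
```

If that is right, the pooling collapses the spatial dimensions entirely, which seems to be where my confusion starts.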

After the Transform step, when the result is added back to the original feature map, each channel receives only a single value broadcast over the entire spatial plane. I had assumed the goal was to compute a global attention map (i.e., query-independent but still key-dependent), which would vary across spatial positions. Could someone explain why the block is designed this way?
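Here is the rest of the block as I read it, with shapes annotated; again a rough sketch under my own assumptions (bottleneck ratio r, layer names), not the paper's reference code:

```python
import torch
import torch.nn as nn

class TransformAndFuse(nn.Module):
    """Transform + fusion steps from Fig. 4(d), taking the pooled
    context vector from the previous sketch as input."""

    def __init__(self, channels, r=16):
        super().__init__()
        # Bottleneck transform: 1x1 conv -> LayerNorm -> ReLU -> 1x1 conv
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels // r, kernel_size=1),
            nn.LayerNorm([channels // r, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // r, channels, kernel_size=1),
        )

    def forward(self, x, context):
        # x: (N, C, H, W) input features; context: (N, C, 1, 1)
        out = self.transform(context)  # still (N, C, 1, 1): no spatial extent
        # The broadcast add copies one scalar per channel over the whole
        # H x W plane, which is exactly the behaviour I am asking about.
        return x + out

# Quick shape check with random tensors
x = torch.randn(2, 64, 7, 7)
ctx = torch.randn(2, 64, 1, 1)
print(TransformAndFuse(64)(x, ctx).shape)  # torch.Size([2, 64, 7, 7])
```

So if my reading is correct, the Transform branch outputs a per-channel scalar applied additively (more like a squeeze-and-excitation-style term) rather than a spatial attention map, and I would like to understand why that is enough.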

submitted by /u/eukaryote31