
[D] Why does the BERT paper say that standard conditional language models cannot be bidirectional?

In the original BERT paper, it is stated on page 4 (bottom of the first column) that:

Unfortunately, standard conditional language models can only be trained left-to-right or right-to-left, since bidirectional conditioning would allow each word to indirectly “see itself”, and the model could trivially predict the target word in a multi-layered context.

It’s not at all obvious to me why, if you have the sentence “I like funny cats”, predicting the word “funny” while conditioning on the fact that it’s preceded by “I” and “like” and followed by “cats” would be trivial, or how the model could “indirectly see” the target word.
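To make my confusion concrete, here is a minimal toy sketch (PyTorch, with made-up names and sizes) of the kind of leak I think the paper is describing: a deep bidirectional conditioner where each layer is even forbidden from attending to its own position, yet after two layers the state at the position of “funny” still depends on the embedding of “funny” itself.

    import torch

    torch.manual_seed(0)
    d, seq_len = 16, 4                                # hypothetical toy sizes
    W_q, W_k, W_v = (torch.randn(d, d) for _ in range(3))

    def toy_layer(x):
        # Bidirectional self-attention over all positions, except that each
        # position is blocked from attending to itself (diagonal masked out).
        q, k, v = x @ W_q, x @ W_k, x @ W_v
        scores = q @ k.T / d ** 0.5
        scores.fill_diagonal_(float("-inf"))          # no direct look at the target
        return torch.softmax(scores, dim=-1) @ v

    emb_a = torch.randn(seq_len, d)                   # stand-in embeddings for "I like funny cats"
    emb_b = emb_a.clone()
    emb_b[2] = torch.randn(d)                         # change only the token at position 2 ("funny")

    out_a = toy_layer(toy_layer(emb_a))               # two stacked bidirectional layers
    out_b = toy_layer(toy_layer(emb_b))

    # Position 2's final state differs between the two runs even though no layer
    # let it attend to itself directly: information about "funny" still reaches it
    # (e.g. the other positions encode it at layer 1 and position 2 reads them at layer 2).
    print(torch.allclose(out_a[2], out_b[2]))         # prints False

Is this two-hop path (target -> neighbours at layer 1 -> back to the target’s position at layer 2) really all the paper means by the word indirectly “seeing itself”?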

I saw this question asked on a number of online platforms, but it never got a response. It would be great if someone with a good understanding of this could explain it.

submitted by /u/StrictlyBrowsing