[D] What is the inductive bias in transformer architectures?

I’ve been thinking a lot about inductive biases recently: basically, equipping a model with a set of assumptions so that it prefers certain solutions over others. This can happen in different ways, e.g. through the model architecture, the loss function, or regularization.
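
To make the regularization case concrete (a toy sketch of my own, not from any particular paper): an L2 penalty is an inductive bias expressed through the loss, since it makes the optimizer prefer small-norm weights among all solutions that fit the data about equally well.

```python
import numpy as np

# Inductive bias via the loss: ridge regression prefers small-norm weights
# over the unregularized least-squares solution.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X @ np.array([1.0, 0.0, 0.0, 0.0, 0.0]) + 0.1 * rng.normal(size=20)

lam = 1.0  # strength of the bias toward small weights
# Closed-form ridge solution: argmin ||Xw - y||^2 + lam * ||w||^2
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
w_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # no regularization bias

print(np.linalg.norm(w_ridge), np.linalg.norm(w_ols))  # ridge norm is smaller
```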

In NLP, RNNs are (still) very popular because, through their recurrence, they exhibit an inductive bias that makes them temporally invariant: the same transition function is applied at every timestep. But recent work (like this) seems to suggest that this also gives them a recency bias, which might limit their usefulness for language.
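
Here is a toy sketch of both properties (illustrative only, not from the linked paper): the same weights are reused at every timestep, which is the temporal invariance, but each step also squashes the old hidden state, so early inputs fade, which is the recency bias.

```python
import numpy as np

# Toy RNN: the SAME (W_h, W_x) are reused at every timestep -> temporal invariance.
rng = np.random.default_rng(0)
W_h = 0.2 * rng.normal(size=(8, 8))  # contractive recurrence (small spectral radius)
W_x = rng.normal(size=(8, 3))

def run(xs):
    h = np.zeros(8)
    for x in xs:  # one shared transition, applied step after step
        h = np.tanh(W_h @ h + W_x @ x)
    return h

xs = rng.normal(size=(50, 3))
h_full = run(xs)
h_late = run(xs[10:])  # drop the 10 earliest inputs

# Recency bias: the final state barely remembers the earliest inputs.
print(np.linalg.norm(h_full - h_late))  # tiny -> early tokens had little effect
```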

Now that transformers dominate the leaderboards on many NLP tasks, I was wondering: what kind of inductive bias do they carry, given their architecture?
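
One property that seems relevant to the question (a sketch of the standard intuition; the code is illustrative, not from any reference): pure self-attention without positional encodings is permutation-equivariant, so shuffling the input tokens just shuffles the outputs. Any bias toward word order has to be injected explicitly, e.g. via positional embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X):  # X: (seq_len, d), no positional encoding
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    return softmax(Q @ K.T / np.sqrt(d)) @ V

X = rng.normal(size=(6, d))
perm = rng.permutation(6)

# Permutation equivariance: permuting the inputs permutes the outputs the
# same way, so the architecture itself carries no notion of token order.
out = self_attention(X)
out_perm = self_attention(X[perm])
print(np.allclose(out[perm], out_perm))  # True
```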

submitted by /u/Kaleidophon