[D] Why does backtranslation work?

I think I must be misunderstanding how backtranslation works, because I’m not seeing how it could help. I’ll describe my current understanding, then ask my questions.

The usual setup is that you have some small set B of parallel data between a source and a target language. Your goal is to build a model that takes a sentence in the source language and produces its translation in the target language.

In addition to the small dataset B, you also have some potentially very large corpus A of monolingual data in the target language. To leverage this data, you train a model in the reverse direction, i.e. target to source, by using B with the entries flipped. Then you use this reverse model to translate every entry of A into the source language, which gives you A’, a set of (synthetic source, real target) pairs. Finally, you add A’ to B to get a final set C, on which you then train the source –> target model.
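To make the setup concrete, here is a rough sketch of the pipeline as I understand it. The trainer is a toy word-for-word lookup that I made up purely for illustration (a real system would train an NMT model here), and the names `train_word_model` and `backtranslate` are hypothetical, not from any library.

```python
from collections import Counter, defaultdict

def train_word_model(pairs):
    """Toy stand-in for a real trainer (an assumption, not an NMT system):
    learn the most frequent word-to-word mapping from position-aligned
    (input, output) sentence pairs and return a translate function."""
    counts = defaultdict(Counter)
    for inp, out in pairs:
        for a, b in zip(inp.split(), out.split()):
            counts[a][b] += 1
    table = {a: c.most_common(1)[0][0] for a, c in counts.items()}
    # Words never seen in training are copied through unchanged.
    return lambda sentence: " ".join(table.get(w, w) for w in sentence.split())

def backtranslate(B, A, train=train_word_model):
    """B: list of (source, target) pairs. A: list of target-language sentences."""
    # Reverse model (target -> source), trained on B with the entries flipped.
    reverse = train([(tgt, src) for src, tgt in B])
    # A': synthetic source sentence paired with the real target sentence.
    A_prime = [(reverse(t), t) for t in A]
    # C = B + A', used to train the final source -> target model.
    C = B + A_prime
    return train(C), C

B = [("le chat", "the cat"), ("le chien", "the dog")]
A = ["the dog", "the cat", "the bird"]
forward, C = backtranslate(B, A)
print(C[-1])               # the synthetic pair built from "the bird"
print(forward("le chat"))
```

One thing the sketch makes visible: in C the target side of every pair is real text, and only the source side of the A’ pairs is machine-generated.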

In some sense, this should only help if your target –> source model is good. However, you trained this model only on B. This raises the following questions:

1) If you can build a good target –> source model from just B, why can’t you do the same with source –> target?

2) If you do get some improvements, why can’t you continue this process? i.e. train the source –> target model using C, then grab some large monolingual corpus in the source language, translate that with the new model to make some new set A”, add A” to C and re-train the target –> source model, then make more source –> target examples by backtranslating with the new reverse model. Rinse and repeat till you run out of compute.
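The loop I have in mind for question 2 would look roughly like the sketch below. Same caveats: `train` is a toy word-lookup stand-in that I made up, every name here is hypothetical, and I simplified one thing relative to my description above by regenerating the synthetic pairs from B each round instead of accumulating them.

```python
from collections import Counter, defaultdict

def train(pairs):
    """Toy stand-in for a real trainer (an assumption, not an NMT system):
    word-for-word lookup learned from position-aligned sentence pairs."""
    counts = defaultdict(Counter)
    for inp, out in pairs:
        for a, b in zip(inp.split(), out.split()):
            counts[a][b] += 1
    table = {a: c.most_common(1)[0][0] for a, c in counts.items()}
    # Unknown words are copied through unchanged.
    return lambda s: " ".join(table.get(w, w) for w in s.split())

def flip(pairs):
    return [(y, x) for x, y in pairs]

def iterative_backtranslation(B, mono_src, mono_tgt, rounds=2):
    """Alternate between the two directions; rounds must be at least 1."""
    tgt2src = train(flip(B))  # initial reverse model, trained on B only
    for _ in range(rounds):
        # Synthetic sources for real target sentences -> train source -> target.
        C = B + [(tgt2src(t), t) for t in mono_tgt]
        src2tgt = train(C)
        # Synthetic targets for real source sentences -> retrain target -> source.
        D = B + [(s, src2tgt(s)) for s in mono_src]
        tgt2src = train(flip(D))
    return src2tgt, tgt2src

B = [("le chat", "the cat"), ("le chien", "the dog")]
src2tgt, tgt2src = iterative_backtranslation(
    B, mono_src=["le chien", "le oiseau"], mono_tgt=["the cat", "the bird"])
print(src2tgt("le chien"), "|", tgt2src("the cat"))
```

With the toy trainer nothing improves after the first round, of course; the sketch is only meant to pin down the order of the steps I am asking about.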

Finally, is there a good reference for this kind of stuff? Most papers which use backtranslation are extremely vague about it.

submitted by /u/TheRedSphinx