[D] Mapping parallel data that shares the same vocabulary


Let’s say we want to translate between two sequences that share the same vocabulary.

We assume that the vocabulary is: V = [A,B,C,D,E,F,G]

We have this parallel data:

Source: [A B C C, A F G]

Target: [E B C C, E F G]

This is just an example.

If we want to represent any sequence, we can use a vector that contains the count of each element of the vocabulary, in vocabulary order.

So A B C C = [1,1,2,0,0,0,0]

A F G = [1,0,0,0,0,1,1]

E B C C = [0,1,2,0,1,0,0]

E F G = [0,0,0,0,1,1,1]
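
The count vectors above can be reproduced with a few lines of Python. This is only a minimal sketch: the names `VOCAB` and `count_vector` are made up for illustration, and it assumes tokens are separated by whitespace.

```python
from collections import Counter

# Vocabulary from the post; the order fixes the vector positions.
VOCAB = ["A", "B", "C", "D", "E", "F", "G"]


def count_vector(sequence, vocab=VOCAB):
    """Return the count of each vocabulary element in a whitespace-separated sequence."""
    counts = Counter(sequence.split())
    # Counter returns 0 for tokens that never occur in the sequence.
    return [counts[token] for token in vocab]


source = ["A B C C", "A F G"]
target = ["E B C C", "E F G"]

source_vectors = [count_vector(s) for s in source]
target_vectors = [count_vector(t) for t in target]
```

Note that this representation ignores token order, so two different sequences with the same counts get the same vector.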

Since A B C C maps to E B C C and A F G maps to E F G, the vectors of each pair should be the same to some extent. For example, we could have something like this:

A B C C = E B C C = [1,1,2,0,1,0,0]
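
The post does not say how this shared vector is built. One reading that happens to reproduce it is the element-wise maximum of the two count vectors; the sketch below assumes that reading, and `merge_pair` is a hypothetical helper name, not an established method.

```python
def merge_pair(source_vec, target_vec):
    """Combine two count vectors by taking the element-wise maximum (an assumed rule)."""
    return [max(s, t) for s, t in zip(source_vec, target_vec)]


abcc = [1, 1, 2, 0, 0, 0, 0]  # A B C C
ebcc = [0, 1, 2, 0, 1, 0, 0]  # E B C C

shared = merge_pair(abcc, ebcc)
```

This only describes the target representation for a known pair. It does not by itself answer how to predict that vector from the source sequence alone.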

My first idea was to train a seq2seq model and extract the encoder's representation of the sequence as the mapping. But it looks like the encoder encodes only the source sequence, not the mapping between source and target.

Are there any algorithms that can perform this task?

submitted by /u/kekkimo
