[Project] Model Based Byte Pair Encoding

https://medium.com/@patry.nicolas/model-based-bpe-encodings-dd664c959a90

Summary: The idea is to build byte pair encodings based not on how frequently a pair occurs in the dataset, but on how well our model predicts it. This lets us learn multi-word tokens like “New York” and handle languages that don’t use spaces to separate words.
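To make the contrast concrete, here is a minimal sketch, not the implementation from the linked post. It assumes that "quality of the prediction" can be approximated by the conditional probability P(next | current), and it uses plain bigram counts as a stand-in for a trained model. The function names and the `min_count` threshold are hypothetical and chosen only for illustration.

```python
from collections import Counter


def pair_counts(tokens):
    """Count every adjacent token pair in the sequence."""
    return Counter(zip(tokens, tokens[1:]))


def most_frequent_pair(tokens):
    """Classic BPE criterion: pick the adjacent pair that occurs most often."""
    counts = pair_counts(tokens)
    return max(counts, key=counts.get)


def most_predictable_pair(tokens, min_count=2):
    """Stand-in for a model-based criterion (an assumption, not the post's method).

    Picks the pair (a, b) with the highest estimated P(b | a), where the
    estimate comes from bigram counts instead of a trained model.
    Pairs seen fewer than `min_count` times are ignored.
    """
    pairs = pair_counts(tokens)
    firsts = Counter(tokens[:-1])  # how often each token starts a pair
    scored = {p: c / firsts[p[0]] for p, c in pairs.items() if c >= min_count}
    return max(scored, key=scored.get) if scored else None


def merge(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


tokens = list("erxeryerzesetququ")
print(most_frequent_pair(tokens))     # frequency-based choice
print(most_predictable_pair(tokens))  # predictability-based choice
print(merge(tokens, most_predictable_pair(tokens)))
```

In this toy sequence "e" is followed by several different characters while "q" is always followed by "u", so the two criteria pick different merges. With a real model, the bigram estimate would be replaced by the model's own predicted probability for the next token.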

Author here: I’m not a researcher, and I could not find any paper related to this idea. If you know of any research in this direction, please let me know. Comments on the post are welcome too.

submitted by /u/narsilouu