https://medium.com/@patry.nicolas/model-based-bpe-encodings-dd664c959a90
Summary: An idea for generating byte pair encodings based not on pair frequency in the dataset, but on how well our model predicts the next token. This lets us learn multi-word tokens like “New York” and handle languages that don’t use spaces to split words.
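To make the contrast with frequency-based BPE concrete, here is a minimal sketch of one merge step. It is not the code from the linked post. It assumes a simple bigram count model as a stand-in for "our model", and the names `pair_stats`, `pick_merge_by_model`, `apply_merge`, `min_count`, and `threshold` are all hypothetical, chosen for illustration only.

```python
from collections import Counter


def pair_stats(tokens):
    """Return (pair_counts, probs), where probs[(a, b)] estimates P(b | a).

    ASSUMPTION: bigram counts stand in for the model's predictions.
    """
    pair_counts = Counter(zip(tokens, tokens[1:]))
    prev_counts = Counter(tokens[:-1])
    probs = {(a, b): n / prev_counts[a] for (a, b), n in pair_counts.items()}
    return pair_counts, probs


def pick_merge_by_model(tokens, min_count=2, threshold=0.9):
    """Pick the adjacent pair whose second token is predicted best.

    min_count and threshold are hypothetical knobs: min_count skips pairs
    seen too rarely to trust, threshold is the required P(b | a).
    """
    pair_counts, probs = pair_stats(tokens)
    candidates = [
        (p, pair_counts[pair], pair)
        for pair, p in probs.items()
        if pair_counts[pair] >= min_count and p >= threshold
    ]
    return max(candidates)[2] if candidates else None


def apply_merge(tokens, pair, sep=" "):
    """Replace every occurrence of the adjacent pair with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + sep + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Toy corpus: "the cat" is the most frequent pair, but "the" is followed by
# several different words, so it is poorly predicted.
tokens = "the cat the dog the cat the bird the cat new york new york".split()

pair_counts, _ = pair_stats(tokens)
frequency_pick = pair_counts.most_common(1)[0][0]  # what plain BPE would merge
model_pick = pick_merge_by_model(tokens)           # what the sketch merges
merged = apply_merge(tokens, model_pick)

print(frequency_pick)  # ('the', 'cat')
print(model_pick)      # ('new', 'york')
print(merged[-2:])     # ['new york', 'new york']
```

In this toy corpus, frequency-based selection would merge "the cat", while the prediction-based score prefers "new york" because "york" always follows "new". With a real language model, the bigram table would presumably be replaced by the probabilities the model assigns to each next token.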
Author here: I’m not a researcher and could not find any paper related to this idea. If you know of any research in this direction, please let me know. Comments on the post are welcome too.
submitted by /u/narsilouu