[D] Threshold for rejecting word embedding similarities

Written by torontoai on July 1, 2019. Posted in Reddit MachineLearning.

I have a set of target words that I need to match against other words found in new CSV files. Are there any good approaches for choosing the similarity threshold below which a match is rejected? One idea is to take a random sample of 10k words and plot the distribution of their pairwise similarities (10,000 × 9,999 / 2, roughly 50 million pairs), but I am not sure whether this is the right approach. Alternatively, should I compute the similarities between each target word and the whole vocabulary and choose a percentile cutoff from that distribution? Any ideas?
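To make the two options concrete, here is a rough sketch of both. It rests on several assumptions that are not stated in the post: the similarity is cosine similarity, the embeddings are available as NumPy arrays with no all-zero rows, and the 99th percentile is just a placeholder cutoff. The function names are made up for illustration, and the random vectors only stand in for real embeddings.

```python
import numpy as np


def unit_rows(m):
    """L2-normalize each row so that dot products equal cosine similarities."""
    return m / np.linalg.norm(m, axis=1, keepdims=True)


def random_pair_threshold(vocab_vecs, n_sample=2000, percentile=99.0, seed=0):
    """Option 1: percentile of cosine similarity over all distinct pairs
    drawn from a random sample of the vocabulary."""
    rng = np.random.default_rng(seed)
    n = min(n_sample, len(vocab_vecs))
    idx = rng.choice(len(vocab_vecs), size=n, replace=False)
    unit = unit_rows(vocab_vecs[idx])
    sims = unit @ unit.T
    # Keep the strict upper triangle: n * (n - 1) / 2 distinct pairs,
    # excluding each word's similarity with itself.
    upper = sims[np.triu_indices(n, k=1)]
    return float(np.percentile(upper, percentile))


def per_target_thresholds(target_vecs, vocab_vecs, percentile=99.0):
    """Option 2: for each target word, the percentile of its similarities
    to every word in the vocabulary (one cutoff per target)."""
    sims = unit_rows(target_vecs) @ unit_rows(vocab_vecs).T
    return np.percentile(sims, percentile, axis=1)


# Placeholder data: random vectors standing in for real word embeddings.
rng = np.random.default_rng(42)
vocab = rng.normal(size=(5000, 50))
targets = rng.normal(size=(3, 50))

global_cut = random_pair_threshold(vocab, n_sample=1000)
target_cuts = per_target_thresholds(targets, vocab)
print("global cutoff:", global_cut)
print("per-target cutoffs:", target_cuts)
```

With a full 10k sample the pairwise matrix has 10^8 entries (about 800 MB as float64), so a smaller sample or chunked computation may be more practical. The first option gives a single global cutoff. The second gives one cutoff per target word, which may matter if some targets sit in denser regions of the embedding space than others.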

submitted by /u/radcapbill