I have a set of target words that I need to match against words found in new CSV files. Are there any good approaches for choosing the similarity threshold below which a candidate match is rejected? One idea is to take a random sample of 10k words and plot the distribution of their pairwise similarities (10,000 × 9,999 / 2, roughly 50 million pairs), but I am not sure whether this is the right approach. Alternatively, should I compute the similarities between each target word and the whole vocabulary and pick a percentile cutoff from that distribution? Any ideas?
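To make the two options concrete, here is a rough sketch of both, assuming the words are already represented as embedding vectors in NumPy arrays and that similarity means cosine similarity (the post does not say which measure is used). The names `vocab_vecs`, `target_vecs`, `sampled_pair_threshold`, and `percentile_threshold` are made up for illustration, and random vectors stand in for real embeddings.

```python
import numpy as np


def cosine_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def sampled_pair_threshold(vocab_vecs, n_pairs=100_000, pct=99.0, seed=0):
    """Option 1: global cutoff from the similarities of random word pairs.

    Sampling pairs avoids materialising all ~50 million pairs of a
    10k-word sample while still estimating the same distribution.
    """
    rng = np.random.default_rng(seed)
    n = len(vocab_vecs)
    i = rng.integers(0, n, size=n_pairs)
    j = rng.integers(0, n, size=n_pairs)
    keep = i != j  # drop pairs that compare a word with itself
    u, v = vocab_vecs[i[keep]], vocab_vecs[j[keep]]
    sims = np.einsum("ij,ij->i", u, v) / (
        np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1)
    )
    return float(np.percentile(sims, pct))


def percentile_threshold(target_vecs, vocab_vecs, pct=99.0):
    """Option 2: one cutoff per target word, taken from the distribution
    of that target's similarity to the whole vocabulary."""
    sims = cosine_matrix(target_vecs, vocab_vecs)
    return np.percentile(sims, pct, axis=1)


# Stand-in data (assumption): random vectors instead of real embeddings.
rng = np.random.default_rng(42)
vocab_vecs = rng.normal(size=(5000, 50))
target_vecs = rng.normal(size=(3, 50))

global_cutoff = sampled_pair_threshold(vocab_vecs)
per_target_cutoffs = percentile_threshold(target_vecs, vocab_vecs)
```

A candidate word would then be rejected when its similarity to a target falls below the chosen cutoff. The per-target version adapts to targets that sit in dense regions of the embedding space, while the global version gives a single number that is easier to reason about. The 99th percentile is an arbitrary placeholder here; the right value depends on how many false matches are acceptable.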
submitted by /u/radcapbill