[D] Should tokens with a very small frequency be removed from the vocabulary before training a word2vec type model?
I believe they should, since such a token appears in only one or a few contexts, so it wouldn't be possible to learn a good representation for it.
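This pre-filtering is standard practice: gensim's `Word2Vec`, for instance, exposes it as the `min_count` parameter (default 5). A minimal stdlib-only sketch of the idea, using a hypothetical `prune_vocab` helper, might look like this:

```python
from collections import Counter

def prune_vocab(sentences, min_count=5):
    """Drop tokens seen fewer than min_count times, mirroring
    the min_count pre-filtering step used before word2vec training."""
    counts = Counter(tok for sent in sentences for tok in sent)
    vocab = {tok for tok, c in counts.items() if c >= min_count}
    # Rare tokens are removed entirely, so they never appear as a
    # target word or as a context word during training.
    return [[tok for tok in sent if tok in vocab] for sent in sentences]

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "axolotl"]]
pruned = prune_vocab(sentences, min_count=2)
# "cat", "dog", and "axolotl" each occur once, so they are dropped;
# "the" (3x) and "sat" (2x) survive.
```

Note that dropping a rare token (rather than masking it) also shrinks the effective context window around it, which is usually considered acceptable noise reduction.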
submitted by /u/searchingundergrad