[D] Is there a way to prove that there is no cluster in a population?

For example, I am looking for clusters in a dataset of 10000 binary variables. I reduce the number of variables to 50 with PCA, then apply t-SNE to the result to find clusters in the output.
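
A minimal sketch of that pipeline, assuming scikit-learn and a small random binary matrix as a stand-in for the real data. The sample count, the reduced number of variables, the perplexity value, and all variable names are illustrative assumptions, not details from the post:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical stand-in for the real dataset: 300 samples x 1000 binary variables
# (the post describes 10000 variables; this is shrunk so the sketch runs quickly).
X = rng.integers(0, 2, size=(300, 1000)).astype(float)

# Step 1: reduce to 50 components with PCA.
X_pca = PCA(n_components=50, random_state=0).fit_transform(X)

# Step 2: embed the 50-dimensional points in 2D with t-SNE for visualisation.
embedding = TSNE(
    n_components=2, perplexity=30, init="pca", random_state=0
).fit_transform(X_pca)

print(X_pca.shape, embedding.shape)
```

Because the stand-in data is pure random noise, any shapes that appear when plotting `embedding` cannot reflect real groups, which makes it a useful baseline to compare against the map of the real data.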

There are separated shapes in the t-SNE visualisation, but I understand that points that are far apart in the output are not always far apart in the higher-dimensional space. t-SNE can even show apparent clusters in a normally distributed dataset.
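
One rough way to check this concern, offered as a diagnostic sketch rather than a proof: run t-SNE on data that has no clusters by construction and compare pairwise distances before and after the embedding. The sketch assumes scikit-learn and SciPy; the isotropic Gaussian data, the sizes, and the choice of Spearman rank correlation are assumptions made for illustration:

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Synthetic data with no cluster structure by construction:
# 300 points of isotropic Gaussian noise in 50 dimensions.
X_noise = rng.standard_normal(size=(300, 50))

emb_noise = TSNE(
    n_components=2, perplexity=30, init="pca", random_state=0
).fit_transform(X_noise)

# Pairwise Euclidean distances in the original space and in the 2D map.
d_high = pdist(X_noise)
d_low = pdist(emb_noise)

# Rank correlation between the two sets of distances. A value well below 1
# means the ordering of distances in the map differs from the original space.
rho, _ = spearmanr(d_high, d_low)
print(f"Spearman correlation of original vs embedded distances: {rho:.2f}")
```

Plotting `emb_noise` next to the map of the real data gives a visual reference for what structureless data looks like under the same t-SNE settings.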

Can we prove that the data follows a normal distribution in all directions of the space? Is there a way to remove variables that would add noise to the clustering? Something like variable selection, but for clustering?

submitted by /u/elpiro