[P] Structure-preserving dimensionality reduction in very large datasets

Hi there, we’re a London-based research team working on clinical applications of machine learning. Recently, we’ve been dealing with many clinical datasets that exceed 1M observations and 20K features. We found that traditional dimensionality reduction and feature extraction methods don’t handle data at this scale without subsampling, and are actually quite poor at preserving both the global and local structure of the data. To address these issues, we’ve been looking into Siamese Networks for non-linear dimensionality reduction and metric learning applications. We are making our work available through an open-source project: https://github.com/beringresearch/ivis
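For readers unfamiliar with the approach: a Siamese network learns an embedding by comparing points in triplets (an anchor, a similar "positive", and a dissimilar "negative"), trained with a triplet margin loss that pulls positives toward the anchor and pushes negatives away. The following is a minimal NumPy sketch of that loss, not ivis's actual implementation; the array shapes and margin value are illustrative assumptions.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss: encourage d(anchor, positive) + margin < d(anchor, negative).

    anchor, positive, negative: (n_triplets, embedding_dim) arrays of embeddings.
    Returns the mean hinge loss over the batch.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=1)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2, axis=1)  # squared distance to negative
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

# Toy check: a "good" embedding keeps positives close and negatives far,
# so its loss is zero; swapping the roles produces a large loss.
rng = np.random.default_rng(0)
anchor = rng.normal(size=(8, 2))
positive = anchor + 0.01 * rng.normal(size=(8, 2))  # near-duplicates of the anchors
negative = anchor + 5.0                              # far-away points

loss_good = triplet_loss(anchor, positive, negative)  # 0.0 (margin satisfied)
loss_bad = triplet_loss(anchor, negative, positive)   # large (margin violated)
```

In practice the embeddings are the output of a shared neural network applied to raw inputs, and triplets are mined from the data (ivis uses nearest neighbours for this); minimizing this loss over many triplets is what lets the learned low-dimensional space preserve neighbourhood structure.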

So far, we’ve applied ivis to single cell datasets, images, and free text – we’re really keen to see what other applications could be enabled! We’ve also run a large number of benchmarks looking at both accuracy of embeddings and processing speed – https://bering-ivis.readthedocs.io/en/latest/timings_benchmarks.html – and can see that ivis begins to stand out on datasets with 250K+ observations. We’re really excited to make this project open source – there’s so much more to Siamese Networks than one-shot learning!

submitted by /u/bering_team