[D] Is there a well-maintained list of good “benchmark” datasets for ML?

I’m looking for up-to-date datasets for benchmarking various algorithms against the performance (both speed and accuracy) of published models.

I’ve found some datasets, but the main issue is that they are either:

a) Very old and small, e.g. most datasets hosted by UCI, which are rather “easy” to “solve” nowadays, and most papers using them came out decades ago. Even barring that, a lot of the papers dealing with the data are not ideal as benchmarks per se, because they are not very specific about how they split the data into train/validation/test sets.

b) Focused on images, e.g. CIFAR-100 is pretty decent, and there are loads of high-quality models with known accuracy and available source code… but I can’t find the equivalent of CIFAR-100 for, say, financial time series prediction, or STT, or geospatial movement prediction for cars… or any problem other than image classification -_-
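To make the complaint in (a) concrete, here is a minimal sketch of what a fully specified split could look like: a fixed seed plus explicit fractions, so anyone can reproduce the exact same partition. The function name `split_indices`, its parameters, and the default fractions are all made up for illustration; none of this comes from a particular paper or library.

```python
import random


def split_indices(n, val_frac=0.1, test_frac=0.2, seed=0):
    """Return (train, val, test) index lists for a dataset with n rows.

    Hypothetical helper: the same (n, fractions, seed) always yields
    the same partition, which is what a benchmark needs to state.
    """
    if n < 0 or val_frac < 0 or test_frac < 0 or val_frac + test_frac > 1:
        raise ValueError("invalid dataset size or split fractions")

    indices = list(range(n))
    # A private Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)

    n_test = int(n * test_frac)
    n_val = int(n * val_frac)

    test = indices[:n_test]
    val = indices[n_test:n_test + n_val]
    train = indices[n_test + n_val:]
    return train, val, test


train, val, test = split_indices(100, val_frac=0.1, test_frac=0.2, seed=42)
```

A shuffled split like this only makes sense for data without a time order. For something like financial time series, a chronological cut would be the safer choice, since shuffling leaks future information into the training set.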

Is there any well-maintained list of datasets that specifically have various models benchmarked against them? Or would it be better to just do a reverse search on this problem, as in, look for interesting papers that came out in the last few years and use the datasets they used?

submitted by /u/elcric_krej

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, VR, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.