[P] Kannada-MNIST: A new handwritten digits dataset for the Kannada language

Dear ML community members,
I'd like to share a new handwritten digits dataset for the Kannada script, termed Kannada-MNIST, that can potentially serve as a direct drop-in replacement for the original MNIST dataset.
In addition, I am releasing a second, real-world handwritten dataset (10k images), termed Dig-MNIST, that can serve as an out-of-domain test dataset.

Class-wise mean images for the Kannada-MNIST dataset
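
As a rough illustration of what a class-wise mean image is, the sketch below averages all images that share a label, pixel by pixel. It is not taken from the released code: the function name `class_wise_mean_images` is hypothetical, and the tiny 2x2 images are toy stand-ins for the actual digit images.

```python
from collections import defaultdict


def class_wise_mean_images(images, labels):
    """Return {label: mean image}; each image is a list of rows of pixel values."""
    sums = {}                  # per-label running pixel sums
    counts = defaultdict(int)  # per-label number of images seen
    for img, label in zip(images, labels):
        if label not in sums:
            # Start an all-zero accumulator with the same shape as the image.
            sums[label] = [[0.0] * len(row) for row in img]
        acc = sums[label]
        for r, row in enumerate(img):
            for c, px in enumerate(row):
                acc[r][c] += px
        counts[label] += 1
    # Divide each accumulated sum by the number of images in that class.
    return {
        label: [[v / counts[label] for v in row] for row in acc]
        for label, acc in sums.items()
    }


# Toy data (assumption): three 2x2 "images" with labels 0, 0 and 1.
images = [
    [[0, 255], [0, 255]],
    [[0, 255], [0, 0]],
    [[255, 0], [255, 0]],
]
labels = [0, 0, 1]

means = class_wise_mean_images(images, labels)
for label in sorted(means):
    print(label, means[label])
```

Viewed as a grayscale picture, each mean image shows the "average" shape of a digit class, which is one quick way to compare the morphology of two digit datasets.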

  1. I am also open-sourcing all the code, along with the raw scanned images and the scanner settings, so that researchers who want to try out different signal processing pipelines can perform end-to-end comparisons.
  2. I provide high-level morphological comparisons with the MNIST dataset and baseline accuracies for both datasets. The initial baselines obtained using an oft-used CNN architecture (96.8% for the main test set and 76.1% for the Dig-MNIST test set) indicate that these datasets pose a sterner generalization challenge than the MNIST or KMNIST datasets.
  3. I also hope this release will spur the creation of similar datasets for all languages that use different symbols for the numeral digits.

ArXiv link: 👉 https://arxiv.org/abs/1908.01242

GitHub link: 👉 https://github.com/vinayprabhu/Kannada_MNIST

Kaggle link: 👉 https://www.kaggle.com/higgstachyon/kannada-mnist

Blog: 👉 https://bit.ly/2H43Vbk

Citation: 👉 Prabhu, Vinay Uday. "Kannada-MNIST: A new handwritten digits dataset for the Kannada language." arXiv preprint arXiv:1908.01242 (2019).

submitted by /u/VinayUPrabhu

Toronto AI is a social and collaborative hub to unite AI innovators of Toronto and surrounding areas. We explore AI technologies in digital art and music, healthcare, marketing, fintech, vr, robotics and more. Toronto AI was founded by Dave MacDonald and Patrick O'Mara.