[R] TextVQA Challenge: Close the large gap between human accuracy and state-of-the-art.

Written by torontoai on May 12, 2019. Posted in Reddit MachineLearning.

Dataset Website: https://textvqa.org

Challenge Link: https://evalai.cloudcv.org/web/challenges/challenge-page/244/overview

Prize: $10k GCP Credits

Starter Code: https://github.com/facebookresearch/pythia

Paper: https://arxiv.org/abs/1904.08920

Deadline: 18th May (ask for extension if needed)

More details on the challenge: https://textvqa.org/challenge

Explore the dataset: https://textvqa.org/explore

Detailed Description: Current state-of-the-art VQA models cannot read and reason about text in images, even though questions about text are the ones users of such systems ask most often. TextVQA aims to provide a benchmark for measuring the progress of VQA models on text reading and reasoning.

Existing state-of-the-art VQA models reach only around 14% accuracy on TextVQA, while human accuracy is ~85%. The LoRRA module introduced in the TextVQA paper can be attached to any VQA model to add text reading and reasoning capabilities. The current state of the art on TextVQA is ~27% with LoRRA.
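To make the accuracy figures above concrete, here is a minimal sketch of how a per-question score could be computed. It assumes the challenge uses the simplified form of the standard VQA accuracy metric, where a prediction gets full credit if at least three human annotators gave the same answer. The post itself does not state the metric, and the function names `vqa_accuracy` and `mean_accuracy` are hypothetical, not taken from the starter code.

```python
from collections import Counter


def vqa_accuracy(prediction, human_answers):
    """Soft accuracy for one question (assumed metric, simplified form).

    Returns min(matching_annotators / 3, 1), so an answer given by
    three or more annotators earns full credit.
    """
    normalized = [answer.strip().lower() for answer in human_answers]
    matches = Counter(normalized)[prediction.strip().lower()]
    return min(matches / 3.0, 1.0)


def mean_accuracy(predictions, references):
    """Average the per-question scores over a set of questions."""
    scores = [vqa_accuracy(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores) if scores else 0.0
```

Under this assumed metric, a model at ~27% earns, on average, roughly a quarter of the available credit per question, which is what leaves such a large gap to the ~85% human score.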

Use the starter code to participate in the challenge, win $10k in GCP credits, and help close this large gap.

submitted by /u/apsdehal
