[R] TextVQA Challenge: Close the large gap between human accuracy and state-of-the-art.
Dataset Website: https://textvqa.org
Challenge Link: https://evalai.cloudcv.org/web/challenges/challenge-page/244/overview
Prize: $10k GCP Credits
Starter Code: https://github.com/facebookresearch/pythia
Paper: https://arxiv.org/abs/1904.08920
Deadline: 18th May (ask for extension if needed)
More details on the challenge: https://textvqa.org/challenge
Explore the dataset: https://textvqa.org/explore
Detailed Description: Current state-of-the-art VQA models cannot read and reason about text in images, even though text is what users of such systems ask about most. TextVQA aims to provide a benchmark for measuring the progress of VQA models on text reading and reasoning.
Prior state-of-the-art VQA models reach only around 14% accuracy on TextVQA, while human accuracy is ~85%. The LoRRA module introduced in the TextVQA paper can be attached to any VQA model to add text reading and reasoning capabilities. With LoRRA, the current state of the art on TextVQA is ~27%.
Use the starter code to participate in the challenge for a chance to win $10k in GCP credits and help close this large gap.
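For anyone wondering how the accuracy numbers above might be computed, here is a minimal sketch. It assumes the challenge scores predictions with the usual VQA-style soft accuracy, where an answer gets full credit once at least three annotators gave it. The function names and the simple lowercase/whitespace normalization are illustrative assumptions, not the official evaluation code, so check the challenge page for the exact rules.

```python
from collections import Counter


def normalize(answer: str) -> str:
    # Assumed normalization: lowercase and collapse whitespace.
    # The official evaluator may apply additional rules.
    return " ".join(answer.lower().strip().split())


def soft_accuracy(prediction: str, human_answers: list[str]) -> float:
    # Simplified VQA-style score: min(number of matching human answers / 3, 1).
    counts = Counter(normalize(a) for a in human_answers)
    return min(counts[normalize(prediction)] / 3.0, 1.0)


def mean_accuracy(predictions: list[str], references: list[list[str]]) -> float:
    # Average the per-question scores over the whole evaluation set.
    scores = [soft_accuracy(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)
```

Under this sketch, a prediction matching two of ten annotators scores about 0.67, and one matching three or more scores 1.0.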
submitted by /u/apsdehal