[P] Character-based OCR model?
Has anybody achieved good results with character-based OCR? I’ve been struggling to train a CNN to recognize common fonts – seems like the best accuracy I can squeeze out is around 75-80%.
This gist shows the Keras model I’ve been working with (and some variations). Fast inference is important for this application, so I’m trying to keep it as lightweight as possible. Character boxes are scaled down to 28×28.
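For context on the 28×28 step, here is a rough sketch of one way a character box could be fitted into a fixed square. It is an illustration only, not the code from the gist: the function name `fit_to_square`, the nearest-neighbour resampling, and the choice to preserve aspect ratio and pad with background are all assumptions. It uses plain Python lists so it runs without any imaging library.

```python
def fit_to_square(glyph, size=28, background=0):
    """Fit a 2D list of pixel values into a size x size grid.

    Assumptions (not from the original post): nearest-neighbour
    resampling, aspect ratio preserved, result centred and padded
    with `background`.
    """
    src_h = len(glyph)
    src_w = len(glyph[0])
    # One scale factor for both axes so the glyph is not stretched.
    scale = min(size / src_w, size / src_h)
    new_w = max(1, min(size, round(src_w * scale)))
    new_h = max(1, min(size, round(src_h * scale)))
    # Offsets that centre the scaled glyph inside the square.
    off_x = (size - new_w) // 2
    off_y = (size - new_h) // 2
    out = [[background] * size for _ in range(size)]
    for y in range(new_h):
        src_y = min(src_h - 1, int(y / scale))
        for x in range(new_w):
            src_x = min(src_w - 1, int(x / scale))
            out[off_y + y][off_x + x] = glyph[src_y][src_x]
    return out


# Example: a tall, narrow box (56 rows x 14 columns) such as "l" or "1".
tall_box = [[1] * 14 for _ in range(56)]
square = fit_to_square(tall_box)
```

With a single scale factor, a narrow glyph such as "l" stays narrow in the 28×28 output instead of being stretched to fill the whole square. Whether that helps accuracy here is untested.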
We only need to recognize onscreen text (web pages, documents, etc.), which is pretty much ideal circumstances for OCR. Some mistakes are expected at the character level (e.g. I vs l vs 1), but what we’re seeing is significantly worse and sometimes pure gibberish. Training data is synthetic but virtually identical to real-world text.
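To separate the expected look-alike mistakes from everything else, the errors could be tallied along these lines. This is a minimal sketch, assuming the expected and predicted characters are available as equal-length strings. The name `tally_errors` and the `CONFUSABLE_GROUPS` list are hypothetical; only the I/l/1 group comes from the example above, and the O/0 group is an added guess.

```python
from collections import Counter

# Hypothetical groups of look-alike glyphs; adjust to the fonts in use.
CONFUSABLE_GROUPS = ["Il1", "O0"]


def tally_errors(expected, predicted, groups=CONFUSABLE_GROUPS):
    """Split per-character results into correct, look-alike, and other errors."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    # Map each confusable character to the index of its group.
    group_of = {ch: i for i, group in enumerate(groups) for ch in group}
    counts = Counter()
    confusions = Counter()  # (expected, predicted) pairs for non-look-alike errors
    for want, got in zip(expected, predicted):
        if want == got:
            counts["correct"] += 1
        elif want in group_of and group_of.get(got) == group_of[want]:
            counts["lookalike"] += 1
        else:
            counts["other"] += 1
            confusions[(want, got)] += 1
    return counts, confusions


counts, confusions = tally_errors("Illinois 10", "lllinqis lO")
print(dict(counts))             # per-category totals
print(confusions.most_common()) # most frequent unexpected confusions
```

If most of the errors land in the "other" bucket, the problem is probably not font ambiguity, and the most common pairs in `confusions` may hint at where the pipeline goes wrong.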
Is it realistic to achieve decent accuracy from a character-based net? Adding recurrent layers and a dictionary would be pretty heavy for this application, so we’re hoping to avoid it. If anybody can provide recommendations, I’d love to hear them.