[P] Real-time detection of handwritten digits
I want to implement a handwritten digit detector that works in real time. I want to draw boxes over the video stream to show the user which digits have been detected. Both training and detection run on a GPU.
Since I am relatively new to TensorFlow and CNNs, I looked into several models and tried to train them on MNIST plus some labeled images I made myself.
So far I have tried: SVM, LeNet-5, R-CNN and YOLOv2.
From what I have read so far, I think YOLOv2 or YOLOv3 would be an appropriate neural network for the task because they are very fast at detection. But they have so many layers and seem very complex.
Do I need to choose such a complex CNN? As far as I understand, YOLO was originally designed for detecting real-world (3D) objects in photos across many classes, while I only need it for 2D digits (only 10 classes).
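To put a number on "complex", here is a rough sketch that counts the trainable parameters of LeNet-5, the smallest CNN I tried. It assumes the common fully connected variant (32x32 grayscale input, i.e. MNIST padded from 28x28, two 5x5 convolutions, 2x2 pooling, dense layers of 120, 84 and 10 units), which may not match my exact setup. The helper names are made up for illustration.

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus one bias per filter for a k x k convolution."""
    return (k * k * in_ch + 1) * out_ch


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a dense layer."""
    return (n_in + 1) * n_out


def window_out(size, k, stride=1):
    """Output side length of an unpadded convolution or pooling window."""
    return (size - k) // stride + 1


def lenet5_param_count(input_size=32, num_classes=10):
    """Per-layer and total parameter counts (assumed LeNet-5 variant)."""
    s = window_out(input_size, 5)   # conv1, 5x5
    s = window_out(s, 2, 2)         # 2x2 pooling, stride 2
    s = window_out(s, 5)            # conv2, 5x5
    s = window_out(s, 2, 2)         # 2x2 pooling, stride 2
    layers = {
        "conv1": conv_params(1, 6, 5),
        "conv2": conv_params(6, 16, 5),
        "fc1": dense_params(16 * s * s, 120),
        "fc2": dense_params(120, 84),
        "out": dense_params(84, num_classes),
    }
    return layers, sum(layers.values())


if __name__ == "__main__":
    layers, total = lenet5_param_count()
    for name, count in layers.items():
        print(f"{name}: {count}")
    print(f"total: {total}")
```

Under these assumptions the whole network has 61,706 parameters, most of them in the first dense layer. Note that this only covers classifying a digit that is already cropped. It does not locate digits in a frame, which is the part the detection networks add.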
As I said, I am new to the topic, so please be nice…^^