[P] AI ping pong game w/ object detection on Raspberry Pi 4 & Google Coral USB Accelerator
I recently got my hands on the Raspberry Pi 4 and the Google Coral USB Accelerator, and I decided to program a fun little game to get an idea of how these devices perform. This is a video summary about it:
- The machine learning model used is a MobileNet SSD v2 trained on faces, which is publicly available.
- The video frames processed by the machine learning model are 720 x 480 pixels.
- The frame rate during the game fluctuates between 20 and 30 fps.
- The program is written in Python and, apart from the TFLite model, most of it is built with OpenCV.
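To give an idea of how a detection can act as a bat, here is a minimal sketch of the ball logic in plain Python. It is not the code from my game: the names `Ball`, `hits_box` and `step` are made up for illustration, and it assumes the detector returns boxes as `(x1, y1, x2, y2)` in frame pixels. Only the 720 x 480 frame size comes from the setup above.

```python
from dataclasses import dataclass

FRAME_W, FRAME_H = 720, 480  # frame size mentioned in the post


@dataclass
class Ball:
    x: float
    y: float
    vx: float
    vy: float
    radius: float = 10.0


def hits_box(ball, box):
    """Return True if the ball's circle overlaps the box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    # Closest point on the box to the ball centre.
    cx = min(max(ball.x, x1), x2)
    cy = min(max(ball.y, y1), y2)
    return (ball.x - cx) ** 2 + (ball.y - cy) ** 2 <= ball.radius ** 2


def step(ball, bats):
    """Advance the ball by one frame, bouncing off walls and detected boxes."""
    ball.x += ball.vx
    ball.y += ball.vy
    # Bounce off the top and bottom edges of the frame.
    if ball.y - ball.radius < 0:
        ball.vy = abs(ball.vy)
    elif ball.y + ball.radius > FRAME_H:
        ball.vy = -abs(ball.vy)
    # Every detected bounding box (hypothetical detector output) is a bat.
    for box in bats:
        if hits_box(ball, box):
            # Push the ball away from the box centre so it cannot get stuck.
            box_cx = (box[0] + box[2]) / 2
            ball.vx = abs(ball.vx) if ball.x >= box_cx else -abs(ball.vx)
            break
    return ball
```

In a real loop you would call `step` once per frame with the boxes from the latest inference and then draw the ball with OpenCV.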
What I learned:
- The combination of a Raspberry Pi 4, which now has USB 3, and the Google Coral USB Accelerator is a powerful and cheap setup that should definitely be evaluated for projects that deal with machine learning inference on the edge.
- The Raspberry Pi 4 CPU gets pretty hot; it sometimes went up to 80 degrees Celsius! Although the heat issue might be partly fixed with a Raspbian update, I highly recommend getting an active cooling solution like a fan. A heat sink only lowered temperatures by 3-5 degrees in my case.
- I prefer working with these two devices over the Google Coral Dev Board because the latter runs Mendel OS, which I find hard to work with. There are a lot of restrictions when you get started, and I had a hard time finding solutions, even for something as simple as a right click or opening a visual explorer. Raspbian OS, on the other hand, has so much more support because of its great community. I haven't had my hands on the Jetson Nano from NVIDIA, so I cannot make a comparison here.
- I tried turning other objects that the MobileNet SSD v2 (COCO) model can recognize, such as bananas, into ping pong bats, but realised that objects need to be quite big or close to the camera to be recognized reliably at the relatively small image resolution. Smartphones and books worked quite well with that model, but not as well as faces.
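A rough bit of arithmetic for the last point. This is only a sketch: it assumes the model input is 300 x 300 pixels, which is my assumption and depends on the exact model file you use, and the helper name `size_at_model_input` is made up. The frame gets scaled down to the model input before inference, so a small object ends up covering very few pixels.

```python
def size_at_model_input(obj_px, frame_px, input_px=300):
    """Approximate size of an object after the frame is resized for the model.

    obj_px   -- object width (or height) in the camera frame, in pixels
    frame_px -- frame width (or height), in pixels
    input_px -- model input width (or height); 300 is an assumption
    """
    return obj_px * input_px / frame_px


# A banana that is 100 px wide and 40 px high in a 720 x 480 frame:
banana_w = size_at_model_input(100, 720)  # about 41.7 px
banana_h = size_at_model_input(40, 480)   # exactly 25.0 px
print(f"{banana_w:.1f} x {banana_h:.1f} px at the model input")
```

That could be one reason why bigger or closer objects such as faces, books and smartphones were detected more reliably than bananas.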
What do you think about these devices combined? Do you know of a better device or solution for machine learning inference on the edge?
In any case, I hope this helps you make decisions for your future projects 🙂 !