[P] How can I make my rendered training data match real data better?
I’m trying to detect a single type of box in camera images. Instead of using hand-labelled images for training, I want to create the data from a 3D model using Blender and a Python script. So far I have successfully created a dataset and trained RetinaNet on it. I apply some augmentation (color shifts, saturation changes, noise, blurring, sharpening). The results on a validation set (also consisting of synthetic data) are great, but the localization performance on real images is much worse. What changes should I make to my rendering process so that it matches real images better? Since it’s a virtual environment, I have pretty much unlimited control over everything, but I have no clue what makes sense to vary. Some of the detections are flawless, but others are way off, and I can’t tell what visual difference throws the network off.

[Image: an example of a rendered image from the training set]

[Image: excellent results on the validation set (boxes that are halfway hidden are not supposed to be detected)]
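For context, here is a minimal sketch of the kind of Blender render loop the post describes (render a box model from a Python script, varying camera and lighting per frame). The object names ("Camera", "Light", "Box"), the output path, and the specific ranges are assumptions for illustration, not taken from the actual script:

```python
# Sketch of a randomized Blender render loop (object names and paths are assumed).
# Run inside Blender, e.g.: blender scene.blend --background --python render_dataset.py
import math
import random

import bpy

scene = bpy.context.scene
cam = bpy.data.objects["Camera"]   # assumed camera object name
light = bpy.data.objects["Light"]  # assumed light object name
box = bpy.data.objects["Box"]      # assumed name of the box model

for i in range(1000):
    # Random camera position on a hemisphere around the box.
    radius = random.uniform(2.0, 6.0)
    azimuth = random.uniform(0.0, 2.0 * math.pi)
    elevation = random.uniform(math.radians(10), math.radians(80))
    cam.location = (
        box.location.x + radius * math.cos(azimuth) * math.cos(elevation),
        box.location.y + radius * math.sin(azimuth) * math.cos(elevation),
        box.location.z + radius * math.sin(elevation),
    )
    # Aim the camera at the box.
    direction = box.location - cam.location
    cam.rotation_euler = direction.to_track_quat('-Z', 'Y').to_euler()

    # Randomize light position and intensity.
    light.location = (random.uniform(-4, 4), random.uniform(-4, 4), random.uniform(2, 6))
    light.data.energy = random.uniform(100, 1500)

    # Render the frame to disk (bounding-box labels would be exported separately).
    scene.render.filepath = f"/tmp/boxes/render_{i:04d}.png"
    bpy.ops.render.render(write_still=True)
```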
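The augmentation pipeline itself isn’t shown in the post; as a rough sketch of the listed augmentations (color shifts, saturation changes, noise, blurring, sharpening), here is one way to express them with the albumentations library — an assumption, since the post doesn’t name a library, and exact argument names vary between albumentations versions. Note that all of these are purely photometric; geometry, materials, and lighting can only be varied at render time.

```python
# Rough sketch of the listed augmentations using albumentations (library choice
# and parameter values are assumptions, not taken from the original pipeline).
import albumentations as A

transform = A.Compose(
    [
        # Color and saturation shifts
        A.HueSaturationValue(hue_shift_limit=15, sat_shift_limit=30, val_shift_limit=15, p=0.7),
        A.RandomBrightnessContrast(brightness_limit=0.2, contrast_limit=0.2, p=0.5),
        # Sensor-style noise
        A.GaussNoise(p=0.4),
        # Blur or sharpen, but not both at once
        A.OneOf([A.Blur(blur_limit=3), A.Sharpen()], p=0.4),
    ],
    # Keep the box annotations consistent with the augmented image.
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)

# Usage: augmented = transform(image=image, bboxes=bboxes, labels=labels)
```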