[P] Have any of you had experience with loading prebatched data into Keras or PyTorch?
Hey guys.
I’m developing a CNN at the moment, and to avoid memory issues I’ve split my image files into blocks of 500, with each block stored as a single .h5 file in one directory. I’m struggling to write a thread-safe generator for either Keras or PyTorch that loops through all the .h5 files in the directory, loads the 500 images from each one, and then feeds batch_size images at a time from that block (say 2 images, if batch_size = 2) to the neural network to train on.
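For reference, the generator approach described above can be sketched roughly as follows. This is a minimal sketch and not my exact code: `batches_from_blocks` and `load_block` are hypothetical names, and `load_block` stands in for whatever reads one .h5 file into memory (for example an h5py read).

```python
from typing import Callable, Iterator, Sequence


def batches_from_blocks(
    block_paths: Sequence[str],
    load_block: Callable[[str], Sequence],
    batch_size: int,
) -> Iterator[Sequence]:
    """Yield consecutive batch_size slices from each block in turn.

    load_block is a hypothetical callable that returns all images stored
    in one block file (e.g. the 500 images in one .h5 file).
    """
    for path in block_paths:
        block = load_block(path)  # only one block is held in memory at a time
        for start in range(0, len(block), batch_size):
            # The last slice of a block may be shorter than batch_size.
            yield block[start:start + batch_size]
```

A plain generator like this keeps its position as internal state, so several workers cannot safely pull from it at once without extra locking.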
I’ve managed to do this somewhat successfully in Keras using a generator, loops, and yield statements, but that approach isn’t really suitable for multiprocessing. So now I’m attempting to do it with a Keras Sequence or a PyTorch Dataset instead. I would appreciate any insight that you could offer, and I’ve linked the relevant SO question for more information.
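The index-based version I have in mind looks roughly like the sketch below. It is written as a plain class with `__len__` and `__getitem__`, which is the protocol both a Keras Sequence and a map-style PyTorch Dataset expect, so it could be adapted to subclass either. The names `BlockBatchIndex` and `load_block` are hypothetical, and the sketch assumes every block holds exactly `block_size` images.

```python
import threading
from typing import Callable, Optional, Sequence


class BlockBatchIndex:
    """Map a global batch index to (block file, slice within that block)."""

    def __init__(
        self,
        block_paths: Sequence[str],
        load_block: Callable[[str], Sequence],
        block_size: int = 500,
        batch_size: int = 2,
    ) -> None:
        self.block_paths = list(block_paths)
        self.load_block = load_block  # hypothetical: reads one .h5 block
        self.batch_size = batch_size
        # Batches per block, rounded up (assumes every block has block_size items).
        self.batches_per_block = -(-block_size // batch_size)
        self._lock = threading.Lock()  # guards the one-block cache across threads
        self._cached_path: Optional[str] = None
        self._cached_block: Optional[Sequence] = None

    def __len__(self) -> int:
        # Total number of batches over all blocks.
        return len(self.block_paths) * self.batches_per_block

    def __getitem__(self, idx: int) -> Sequence:
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        block_idx, batch_in_block = divmod(idx, self.batches_per_block)
        path = self.block_paths[block_idx]
        with self._lock:
            # Reload only when the requested batch lives in a different block.
            if path != self._cached_path:
                self._cached_block = self.load_block(path)
                self._cached_path = path
            block = self._cached_block
        start = batch_in_block * self.batch_size
        return block[start:start + self.batch_size]
```

Because each call depends only on `idx`, there is no shared iteration state to corrupt. The lock only protects the cache between threads; with worker processes, each worker would typically hold its own copy of the object and its own cache. Reading indices in order keeps reloads to one per block, whereas fully shuffled indices would reload a block on almost every call.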
submitted by /u/xandrovich