[P] What is the best way to read data in batches from a datastore?
I’m training a PyTorch model on data stored in Google BigQuery (basically a SQL-like database). What’s the best way to fetch data in batches for training so that data loading doesn’t bottleneck the training loop? Querying too many times is slow, and the dataset is way too large to fit into memory. Are there any best practices or existing tools for this?
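To make it concrete, here’s a rough sketch of the pattern I have in mind: a background thread prefetches pages of rows into a bounded queue while the model trains on the previous batch. `fetch_page` here is just a hypothetical stand-in for a real datastore call (e.g. something built on the BigQuery client), not actual BigQuery API code:

```python
import queue
import threading
from typing import Callable, Iterator, List, Optional

def stream_batches(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 1024,
    prefetch: int = 4,
) -> Iterator[List[dict]]:
    """Yield pages of rows while a background thread fetches the next ones.

    fetch_page(offset, limit) is a placeholder for a real datastore read
    (e.g. a BigQuery query or Storage API read); it returns [] when the
    data is exhausted. A bounded queue caps memory use to ~prefetch pages.
    """
    q: "queue.Queue[Optional[List[dict]]]" = queue.Queue(maxsize=prefetch)

    def producer() -> None:
        offset = 0
        while True:
            page = fetch_page(offset, page_size)
            if not page:
                q.put(None)  # sentinel: no more data
                return
            q.put(page)  # blocks if the consumer is behind, bounding memory
            offset += len(page)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        page = q.get()
        if page is None:
            return
        yield page
```

Is this roughly the right idea (maybe wrapped in a PyTorch `IterableDataset` so `DataLoader` can consume it), or is there a standard library that already does this properly?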