[D] Unzipping dataset in Google Drive using Google Colab is slow
I have my dataset sitting in my google drive as a .zip file. To unzip it I use this command:
!unzip -n /content/drive/AudioMNIST/dataset.zip -d /content/drive/AudioMNIST/
But it is incredibly slow: it takes a couple of seconds to extract a single file, and there are 30,000 files in total. I don’t have a deadline, but this speed is frustrating. Things I have tried so far:
- Uploading my dataset unzipped to my Google Drive, but the browser upload just won’t complete with this many files.
- Uploading file by file to a folder in Google Drive using gdrive from my terminal, but apparently it’s not supported anymore, and I couldn’t figure out how to set up the credentials with the API.
- And then there’s the cloud unzipper “Cloud Convert”, which is relatively fast but requires a subscription to get more than 25 minutes of conversion time per day. Another one was “ZIP Extractor”, which is faster than the above Colab command, but not by much.
Preferably I would like to upload my dataset file by file from my terminal with a python script or so. Has anyone been in this position?
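One possible workaround, sketched under the assumption that the slowdown comes from writing 30,000 small files through the Drive mount rather than from unzip itself: copy the archive to the Colab VM's local disk once and extract it there. The function name `extract_locally` and the target directory `/content/data` are hypothetical, and the extracted files would live only on the VM for the length of the session, not in Drive.

```python
import shutil
import zipfile
from pathlib import Path


def extract_locally(zip_on_drive, local_dir):
    """Copy an archive off the Drive mount, then extract it on local disk.

    Returns the number of entries in the archive.
    """
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)

    # One large sequential copy instead of thousands of small writes to Drive.
    local_zip = local_dir / Path(zip_on_drive).name
    shutil.copy(zip_on_drive, local_zip)

    # Extraction now only touches the VM's local filesystem.
    with zipfile.ZipFile(local_zip) as archive:
        archive.extractall(local_dir / "extracted")
        return len(archive.namelist())


# Hypothetical usage in a Colab cell, with the Drive path from the post:
# extract_locally("/content/drive/AudioMNIST/dataset.zip", "/content/data")
```

If this works, training code would read from `/content/data/extracted` instead of the Drive folder, and the copy and extract step would have to be repeated in each new session.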
submitted by /u/khawarizmy