[D] PyTorch implementation best practices
Hi r/MachineLearning! Let’s discuss PyTorch best practices.
I recently finished a PyTorch re-implementation (with help from various sources) for the paper Zero-shot User Intent Detection via Capsule Neural Networks, which originally had Python 2 code for TensorFlow.
I’d like to request a critique of the code I’ve written so far (it’s not perfect yet!), along with any PyTorch-specific best practices for implementing models directly from research papers and for converting them from other frameworks.
Some thoughts I had while programming (feel free to raise more!):
I’ve been implementing a Dataset class and custom batch functions for every dataset I’ve been working with. Is this the PyTorch best practice?
Where is the optimal place to shift tensors to the GPU with .cuda()? I’ve been doing this in the training loop, just before feeding the batch into the model.
How to manage the use of both numpy and torch, seeing as PyTorch aims to reinvent many of the basic operations in numpy?
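To make the questions above concrete, here is a minimal sketch of the pattern being asked about. It is a toy example, not the code from the re-implementation: `ToyIntentDataset`, `pad_collate`, the made-up token ids, and the tiny embedding model are all hypothetical. It assumes variable-length token sequences, uses a custom batch function passed as `collate_fn`, moves each batch to the device inside the training loop, and converts between NumPy and torch only at the boundaries.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class ToyIntentDataset(Dataset):
    """Hypothetical dataset: variable-length token-id lists with integer labels."""

    def __init__(self, sequences, labels):
        self.sequences = sequences
        self.labels = labels

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, idx):
        # Return CPU tensors; device placement is left to the training loop.
        return torch.tensor(self.sequences[idx], dtype=torch.long), self.labels[idx]


def pad_collate(batch):
    """Custom batch function: pad every sequence to the longest one in the batch."""
    seqs, labels = zip(*batch)
    lengths = torch.tensor([len(s) for s in seqs])
    padded = torch.nn.utils.rnn.pad_sequence(seqs, batch_first=True, padding_value=0)
    return padded, lengths, torch.tensor(labels)


# Pick the device once; .to(device) also works on CPU-only machines,
# which a hard-coded .cuda() call does not.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

dataset = ToyIntentDataset([[1, 2, 3], [4, 5], [6]], [0, 1, 0])
loader = DataLoader(dataset, batch_size=2, collate_fn=pad_collate)

# Stand-in model (assumption): an embedding layer only, moved to the device once.
model = torch.nn.Embedding(10, 4, padding_idx=0).to(device)

batch_shapes = []
for padded, lengths, labels in loader:
    # Move each batch just before the forward pass.
    padded = padded.to(device)
    labels = labels.to(device)
    out = model(padded)  # shape: (batch, max_len, 4)
    batch_shapes.append(tuple(out.shape))

# NumPy interop at the boundaries: from_numpy shares memory with the array,
# and .cpu().numpy() brings a result back for NumPy-based code.
features = np.arange(6, dtype=np.float32).reshape(2, 3)
t = torch.from_numpy(features)
back = t.mul(2).cpu().numpy()
```

Whether this is the recommended layout is exactly the open question; the sketch only shows one common arrangement, in which the dataset stays device-agnostic and the loop owns the transfers.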
If you’re a fellow PyTorch user/contributor please share a little!