[D] Questions on general research practices
I have a few questions on research practices that are generally followed but rarely mentioned in papers.
- Let’s say I have a train-dev-test split. After finding the best hyperparameters on the dev set, should I retrain the model on train+dev before evaluating it on the test set? Some discussions say yes, others say it depends on you and how much data you have, and some say no.
- Let’s say I’m showcasing results on multiple datasets. Can one change the hyperparameters (learning rate, batch size, etc.) from one dataset to another? More importantly, can I change, let’s say, the number of units in a layer without adding more layers? Would this count as an architectural change?
- If yes, how would the answer to the above question change if the same is done within a single dataset that itself contains multiple parts?
- Are we allowed to change a publicly available dataset? For example, by removing outliers for a regression problem?
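
To make the first question concrete, here is a minimal sketch of the two options I mean. Everything in it is made up for illustration: the data is synthetic, the model is a toy k-NN classifier, and the helper names (`make_data`, `knn_predict`, `accuracy`) are hypothetical rather than taken from any paper or library. In both options the hyperparameter `k` is chosen on dev only, and the test set is touched once at the end.

```python
import random

def make_data(n, rng):
    """Two noisy 2-D Gaussian blobs; purely synthetic, for illustration only."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label == 1 else -1.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    return data

def knn_predict(train, point, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, eval_set, k):
    """Fraction of eval_set examples that k-NN fitted on `train` labels correctly."""
    correct = sum(knn_predict(train, p, k) == y for p, y in eval_set)
    return correct / len(eval_set)

rng = random.Random(0)
data = make_data(300, rng)
train, dev, test = data[:180], data[180:240], data[240:]  # 60/20/20 split

# Hyperparameter search uses the dev set only (odd k avoids tied votes).
candidate_ks = [1, 3, 5, 9, 15]
best_k = max(candidate_ks, key=lambda k: accuracy(train, dev, k))

# Option A: report test accuracy of the model fitted on train only.
acc_train_only = accuracy(train, test, best_k)

# Option B: refit on train+dev with the same k, then report test accuracy.
acc_train_plus_dev = accuracy(train + dev, test, best_k)

print(f"best k = {best_k}")
print(f"test accuracy, train only: {acc_train_only:.3f}")
print(f"test accuracy, train+dev:  {acc_train_plus_dev:.3f}")
```

The question is which of these two numbers is the accepted one to report, given that option B uses more data but the selected `k` was never validated on a model of that size.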