[D] Architectural question: with multiple input tensors, how best to combine them into a single output tensor?
Sorry if this question has been asked before. I’m building a classifier that takes multiple tensors (representing images) as input and produces a single output (a probability distribution). Each input has its own branch made of a few stacks of residual blocks, and I’m wondering how best to combine the outputs of these branches. Right now I simply produce logits for each branch and take an element-wise weighted sum over them (each branch gets a coefficient, since one of the input tensors is much more important than the others). Is there a better approach? I’ve heard concatenation is another option, but I’m not sure which would be better. Should I instead create a loss expression for each branch and sum those losses? Thanks for any clarity you can provide.
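
To make the two options concrete, here is a minimal NumPy sketch comparing a weighted sum of per-branch logits with concatenation followed by one shared linear layer. Everything in it is an assumption for illustration, not the actual model: the names `fuse_logit_sum` and `fuse_concat`, the shapes, the coefficients, and the random arrays standing in for the residual-branch outputs. It also assumes each branch ends in a plain linear layer that produces its logits. Under that assumption the weighted logit sum is a special case of concatenation, so a learned layer over concatenated features can represent it and potentially more.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_logit_sum(branch_logits, coeffs):
    # Weighted element-wise sum of per-branch logits.
    stacked = np.stack(branch_logits, axis=0)          # (branches, batch, classes)
    w = np.asarray(coeffs, dtype=stacked.dtype).reshape(-1, 1, 1)
    return (w * stacked).sum(axis=0)                   # (batch, classes)

def fuse_concat(branch_features, W, b):
    # Concatenate per-branch features, then apply one shared linear layer.
    joined = np.concatenate(branch_features, axis=-1)  # (batch, total feature dim)
    return joined @ W + b                              # (batch, classes)

# Hypothetical sizes and random stand-ins for the branch outputs.
rng = np.random.default_rng(0)
batch, feat, classes = 4, 8, 3
features = [rng.normal(size=(batch, feat)) for _ in range(3)]
heads = [rng.normal(size=(feat, classes)) for _ in range(3)]  # per-branch linear heads
coeffs = [1.0, 0.3, 0.3]  # assumed weights; the first branch matters most

# Option A: per-branch logits, then a weighted sum.
logits_sum = fuse_logit_sum([f @ h for f, h in zip(features, heads)], coeffs)

# Option B: concatenation. Stacking the scaled heads reproduces option A exactly,
# which shows that A is one particular setting of B's weight matrix.
W = np.concatenate([c * h for c, h in zip(coeffs, heads)], axis=0)
logits_cat = fuse_concat(features, W, np.zeros(classes))

probs = softmax(logits_sum)
```

In a real model `W` would be learned rather than built from the heads, so the network could pick the relative importance of each branch itself instead of relying on hand-set coefficients. Whether that works better than the fixed weighted sum is an empirical question for the data at hand.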