[D] GPT2 as seq2seq decoder
Hello! Since I don't have the computational resources to train a seq2seq transformer-based model from scratch, I'm trying to build one by fine-tuning BERT as the encoder and GPT2 as the decoder. Has anyone tried something similar? How can I condition GPT2 on the encoder's output?
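For reference, here is a minimal sketch of one way this could be wired up, assuming the Hugging Face transformers library's EncoderDecoderModel, which adds cross-attention layers to the GPT2 decoder so it can attend to BERT's hidden states. The checkpoint names, toy sentences, and token-id settings below are illustrative, not a definitive recipe.

    # Sketch: conditioning GPT2 on BERT's output via cross-attention,
    # assuming Hugging Face transformers' EncoderDecoderModel is available.
    from transformers import EncoderDecoderModel, BertTokenizerFast, GPT2TokenizerFast

    # Tie a pretrained BERT encoder to a pretrained GPT2 decoder; this call
    # sets is_decoder=True and add_cross_attention=True on the GPT2 config,
    # so each GPT2 block gains a (newly initialized) cross-attention layer
    # over the encoder's hidden states.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        "bert-base-uncased", "gpt2"
    )

    enc_tok = BertTokenizerFast.from_pretrained("bert-base-uncased")
    dec_tok = GPT2TokenizerFast.from_pretrained("gpt2")
    dec_tok.pad_token = dec_tok.eos_token  # GPT2 has no pad token by default

    # The decoder needs start/pad token ids for generation and loss masking.
    model.config.decoder_start_token_id = dec_tok.bos_token_id
    model.config.pad_token_id = dec_tok.pad_token_id

    # Toy source/target pair (hypothetical data).
    src = enc_tok("The encoder reads this sentence.", return_tensors="pt")
    tgt = dec_tok("The decoder writes this one.", return_tensors="pt")

    # labels supply the decoder-side ids for the LM loss; encoder and
    # decoder are fine-tuned end-to-end together with the new cross-attention.
    outputs = model(
        input_ids=src.input_ids,
        attention_mask=src.attention_mask,
        labels=tgt.input_ids,
    )
    outputs.loss.backward()

The key point is the cross-attention: the GPT2 decoder stays autoregressive over its own tokens, but each block additionally attends to the BERT encoder outputs, which is how the generation gets conditioned on the source sequence.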
submitted by /u/Viecce