[D] What are the current SOTA architectures for NLP information extraction & question answering?

I've been working primarily in a different field of DL for a while, but I have a project coming up related to NLP. I've done some research, and the models that show up most often are GPT-2, BERT, and ELMo. However, I'm under the impression that these are overshadowing others that may be better suited to the task.

If it's relevant: my domain expertise is in medicine, and I intend to use the model for medical purposes.

