[P] OpenGPT-2: We Replicated GPT-2 Because You Can Too
The authors trained a 1.5-billion-parameter GPT-2 model on a similarly sized text dataset, OpenWebTextCorpus, and report perplexity results that can be compared with the original model.
Recently, large language models like BERT¹, XLNet², GPT-2³, and Grover⁴ have demonstrated impressive results in generating text and on multiple NLP tasks. Since OpenAI has not released their largest model at this time (but has released their 774M-parameter model), we seek to replicate their 1.5B model so that others can build on our pretrained model and further improve it.
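For readers who want to see how a perplexity comparison like the one mentioned above is typically computed, here is a minimal sketch using the Hugging Face transformers library. This is not the authors' evaluation code; the "gpt2" checkpoint below is the public 124M model standing in for a replicated 1.5B checkpoint, which would be loaded the same way, and the sample text is arbitrary.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Illustrative stand-in checkpoint; a replicated 1.5B model would be loaded identically.
model_name = "gpt2"
tokenizer = GPT2TokenizerFast.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()

text = "Language models are trained to predict the next token in a sequence."
encodings = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing the inputs as labels makes the model return the mean
    # cross-entropy loss over the sequence.
    outputs = model(encodings.input_ids, labels=encodings.input_ids)

# Perplexity is the exponential of the mean negative log-likelihood per token.
perplexity = torch.exp(outputs.loss)
print(f"Perplexity: {perplexity.item():.2f}")
```

In practice, comparisons like those in the article average this quantity over a held-out test set (e.g. a slice of OpenWebTextCorpus) rather than a single sentence.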
https://medium.com/@vanya_cohen/opengpt-2-we-replicated-gpt-2-because-you-can-too-45e34e6d36dc
submitted by /u/baylearn