[D] ELI5: GPT-2 Model Size?
So, I know that the code and weights for GPT-2 345M are publicly released, and there are people who've been training GPT-2 on various things, such as Magic cards, cat names, and Facebook Messenger posts. I guess my question is, what is preventing people from training their own 1.5B model? Heck, what do the parameters even mean?
I've got a coding background, but I'm only familiar with the basics of neural nets; I've only been following this because media synthesis is really fun.