[P] Not sure if this is too silly for this reddit but I used GPT-2 345M to recreate the famous (but fake) “I forced a bot to write an Olive Garden commercial” tweet, but I fine-tuned GPT-2 on Star Trek scripts first…
Link To The Post:
What is this?
About a year ago, a tweet went viral purporting to be written by a bot: I forced a bot to watch over 1,000 hours of Olive Garden commercials and then asked it to write an Olive Garden commercial of its own. Here is the first page.
It was a funny script, but it was written by a comedy writer, not a bot. This was obvious to some people but not to others, and enough people believed it was real that sites wrote articles debunking it.
I’ve had this in the back of my mind for a while. It used to be obvious that scripts like this were fake: anyone who had used an RNN knew it could only hold a coherent thought for about two sentences, so there’s no way one would produce something like that. But since that tweet, GPT-2 came out and changed everything. I was looking for something to test fine-tuning the new GPT-2 345M on and picked the dumbest, silliest option.
I can’t tell you exactly why I also mixed this with Star Trek: The Next Generation and Deep Space Nine scripts, except that I found it hilarious.
How it works
The training set is all Star Trek, but GPT-2 is shockingly good at writing lines appropriate for a waitress or restaurant scene that are 100% not in the Star Trek script training set. I love how creative it can be, since it has such a wide range of pre-baked knowledge.
Are these hand selected samples?
Nope. The title says 1,000 commercials, but it’s actually over 30,000 commercials now. I couldn’t resist trying different prompts and tweaks to the training data to get better results. I didn’t filter the samples at all, so it’s a wide mix of iterations, temperatures, learning rates, and tweaked training sets; I pretty much just tossed everything up there. Overall, I was impressed both with how well GPT-2 worked with very little fine-tuning and with how it avoided overfitting with a lot more training. I trained these same Trek scripts with the smaller GPT-2 and eventually had overfitting problems, but I never hit that point with 345M. I suppose it’s possible it just trains a lot slower…
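For anyone unfamiliar with the temperature setting mentioned above, here is a minimal sketch of how temperature rescales a model's next-token scores before sampling. It is a generic illustration, not the code used to generate these samples; the candidate tokens, their scores, and the function names are all made up for the example.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature=1.0, rng=random):
    """Draw one token at random according to the temperature-scaled probabilities."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical next-token candidates and scores, invented for illustration.
tokens = ["breadsticks", "Captain", "warp", "pasta"]
logits = [2.0, 1.5, 0.5, 0.1]

for t in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, so the output is more predictable. Higher temperatures flatten the distribution, which gives weirder and more varied text.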