[D] 1,000 patent claims by GPT-2
Does anybody know whether the 40GB WebText dataset used to train GPT-2 contains lots of patents? As early as the 36th step of fine-tuning, GPT-2 starts generating patent-like text correctly, complete with the three special tags (“<|startoftext|>”, “<|endoftext|>”, “@@@”) from our training data. It is really unreasonably effective. Has anybody seen something similar during fine-tuning?
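For context, here is a minimal sketch of how we wrap each training example in those delimiters. This is an assumed layout for illustration (the exact role of “@@@” as a separator between the independent claim and its dependent claims is our convention, not anything from OpenAI); the helper name `format_example` is made up:

```python
def format_example(independent_claim, dependent_claims):
    """Wrap one patent claim set in the fine-tuning delimiters.

    Assumed convention: "@@@" separates the independent claim from
    each dependent claim; the whole example is bracketed by the
    start/end tokens so GPT-2 learns where documents begin and end.
    """
    body = "@@@".join([independent_claim] + list(dependent_claims))
    return "<|startoftext|>" + body + "<|endoftext|>"


example = format_example(
    "1. A device comprising a sensor and a processor.",
    ["2. The device of claim 1, wherein the sensor is optical."],
)
print(example)
```

After a few dozen fine-tuning steps the model reproduces this exact bracketing in its samples, which is what surprised us.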