[R] [D] NLP, Any papers on text summarization on very long (arbitrary length) text?
Hi, I’m catching up on the text summarization scene, and most of the papers I’ve seen use the CNN/DailyMail, Newsroom, and XSum datasets; the maximum document size for any of these seems to be ~1000 tokens. Are there any papers that deal with very long (or arbitrary-length) documents?
As I understand it, most of the current SOTA is transformer-based, and those models are bound by the number of positional embeddings they were trained with.
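To make the constraint concrete, here is a minimal sketch (with hypothetical numbers — `max_positions` and `d_model` are illustrative, not from any specific model) of why a learned positional-embedding table caps the input length: the table has a fixed number of rows, so any sequence longer than that cannot be assigned positions without truncation or some workaround.

```python
import numpy as np

# Hypothetical sizes: 1024 positions is a common budget (e.g. BART-scale),
# and d_model is kept tiny here just for illustration.
max_positions = 1024
d_model = 16

# A learned positional-embedding table: one row per position.
pos_table = np.random.rand(max_positions, d_model)

def encode_positions(n_tokens):
    # Look up a positional vector for each of the first n_tokens positions.
    # Anything past the table's last row simply has no embedding.
    if n_tokens > max_positions:
        raise ValueError(
            f"{n_tokens} tokens exceeds the {max_positions}-position table"
        )
    return pos_table[:n_tokens]

print(encode_positions(1000).shape)  # a 1000-token input fits: (1000, 16)
# encode_positions(5000) would raise ValueError -- the document must be
# truncated, chunked, or handled by a long-input architecture instead.
```

This is why long-document work tends to either chunk the input, use relative/extrapolating position schemes, or use sparse-attention architectures built for longer sequences.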
submitted by /u/natural_language_guy