This paper is actually trying to solve a non-issue. Their claim is that when you do packing (they call it "concat and chunk" lol) you get cross-document attention leakage. The truth is that if your infra is decent, you'll have segmentation masks that prevent this from happening in
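For anyone unsure what a segmentation mask means here, this is a minimal sketch, not any particular framework's implementation. It assumes each token in a packed row carries an integer document ID and that attention is causal; the function name `packed_attention_mask` and the `segment_ids` layout are hypothetical, made up for illustration.

```python
def packed_attention_mask(segment_ids):
    """Build a [T][T] boolean mask for one packed row.

    mask[q][k] is True when query position q may attend to key position k:
    both tokens must come from the same document (same segment id), and
    k must not be later than q (causal attention, assumed here).
    """
    n = len(segment_ids)
    return [
        [segment_ids[q] == segment_ids[k] and k <= q for k in range(n)]
        for q in range(n)
    ]


# Hypothetical packed row: tokens 0-2 come from document 0, tokens 3-4 from document 1.
mask = packed_attention_mask([0, 0, 0, 1, 1])
for row in mask:
    print([int(allowed) for allowed in row])
```

The result is block-diagonal: the first token of the second document (position 3) cannot see positions 0-2, which is exactly the cross-document attention that would otherwise leak.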
Cross-Document Attention Leakage in Sequence Packing Solutions