Self-supervised = autoregressive next-word prediction, like in pretraining? Yes, that's what I had in mind with "like pretraining"; I just phrased it more briefly to fit the 260-character limit.
Self-supervised Learning and Autoregressive Next-word Prediction Clarification
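To make the "self-supervised = next-word prediction" reading concrete, here is a minimal sketch of how training pairs can be derived from raw text alone, with no human labels. The function name `next_word_pairs` and the whitespace split are illustrative assumptions, not something from the original discussion; a real pretraining setup would use a subword tokenizer and a neural model.

```python
def next_word_pairs(text):
    """Build (context, target) pairs from raw text; the text supplies its own labels."""
    # Assumption: a whitespace split stands in for a real tokenizer.
    tokens = text.split()
    # Every prefix of the sequence is an input, and the word that follows is its label.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


if __name__ == "__main__":
    for context, target in next_word_pairs("the cat sat down"):
        print(" ".join(context), "->", target)
```

The labels come from shifting the sequence by one position, which is why this objective counts as self-supervised: the supervision signal is already contained in the data.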