Off-by-One Error in Autoregressive Model Training
By Global AI News Aggregator
Haha, I’ve made this mistake too; it’s pretty easy to do in autoregressive model training. It’s just an off-by-one error: the targets never get shifted, so the model is trained to predict token T from token T instead of predicting token T+1.
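To make the off-by-one concrete, here is a minimal sketch in plain Python. It is not anyone's actual training code: the helper names (`shifted_pairs`, `unshifted_pairs`, `copy_accuracy`) and the token ids are made up for illustration. It assumes the usual next-token setup, where the position holding token T is scored against token T+1.

```python
def shifted_pairs(tokens):
    """Intended setup: position i sees tokens[i] and is scored on tokens[i + 1]."""
    return tokens[:-1], tokens[1:]


def unshifted_pairs(tokens):
    """The off-by-one bug: position i sees tokens[i] and is scored on tokens[i] itself."""
    return tokens[:], tokens[:]


def copy_accuracy(inputs, targets):
    """Fraction of positions where just echoing the input token matches the target."""
    hits = sum(1 for x, y in zip(inputs, targets) if x == y)
    return hits / len(targets)


tokens = [5, 9, 2, 7, 3]  # hypothetical token ids, chosen only for this example

good_inputs, good_targets = shifted_pairs(tokens)
bad_inputs, bad_targets = unshifted_pairs(tokens)

# With the bug, a model that only copies its input already looks perfect.
print("unshifted copy accuracy:", copy_accuracy(bad_inputs, bad_targets))
# With the shift in place, copying gets nothing right on this sequence.
print("shifted copy accuracy:", copy_accuracy(good_inputs, good_targets))
```

A suspiciously low training loss very early on is one hint that the targets may not be shifted, since copying the input is enough to hit them. Some libraries apply the shift internally, so it is worth checking which convention a given loss function expects before shifting by hand.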