AI Dynamics

Global AI News Aggregator

Off-by-One Error in Autoregressive Model Training

Haha, I’ve made this mistake too; it’s pretty easy to do in autoregressive model training. It’s just an off-by-one error: you predict token T from token T instead of predicting token T+1 from token T.
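
To make the mistake concrete, here is a minimal sketch in plain Python of how the input/target pairs are usually built for next-token prediction, next to the buggy version. The function names and the toy token IDs are hypothetical and chosen only for illustration; they are not from the original post.

```python
def make_training_pairs(tokens):
    """Correct next-token setup: the input at position t is paired with token t+1."""
    inputs = tokens[:-1]   # tokens 0 .. n-2
    targets = tokens[1:]   # tokens 1 .. n-1, shifted left by one
    return inputs, targets


def make_training_pairs_buggy(tokens):
    """Off-by-one version: the target at position t is the input token itself."""
    inputs = tokens[:-1]
    targets = tokens[:-1]  # BUG: no shift, so the model is asked to copy its input
    return inputs, targets


# Toy token IDs (assumed values, purely illustrative).
tokens = [10, 11, 12, 13, 14]

inputs, targets = make_training_pairs(tokens)
bad_inputs, bad_targets = make_training_pairs_buggy(tokens)

# In the correct setup every target is the token that follows its input.
assert all(targets[i] == tokens[i + 1] for i in range(len(inputs)))

# In the buggy setup the targets are identical to the inputs.
assert bad_targets == bad_inputs
```

One likely symptom of the buggy version is a training loss that falls suspiciously close to zero, since the model can see token T at position T and only has to copy it; the model then generates poorly because it never learned to predict the next token.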

→ View original post on X — @jxmnop