AI Dynamics

Global AI News Aggregator

BERT masked token prediction and LLM next-word prediction clarification

Hope you are enjoying the book! And thanks for sharing your notes! Btw regarding your nits, I introduced the masked token prediction task of BERT in Fig 1.5 and even refer to BERT-like "LLMs", but you are right, I mostly equate LLMs with "next-word prediction" for simplicity's

→ View original post on X — @rasbt
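
For readers unfamiliar with the two objectives the post contrasts, here is a minimal sketch of how training examples might be built for each. The function names, the `[MASK]` string, and the 15% masking rate are illustrative assumptions, not code from the book or from BERT itself. BERT's actual procedure is more involved (it sometimes substitutes a random token or leaves the selected token unchanged), and that detail is omitted here.

```python
import random

MASK = "[MASK]"  # placeholder string, assumed for illustration


def next_word_pairs(tokens):
    """Next-word prediction: each target is predicted from the tokens before it."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def masked_token_example(tokens, mask_prob=0.15, seed=0):
    """Masked token prediction (simplified): hide some tokens and
    predict them from context on both sides."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    masked = list(tokens)
    targets = {}  # position -> original token the model must recover
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            targets[i] = tok
    return masked, targets


if __name__ == "__main__":
    sentence = "the cat sat on the mat".split()
    for context, target in next_word_pairs(sentence):
        print(context, "->", target)
    print(masked_token_example(sentence, mask_prob=0.3))
```

The contrast to notice is that the next-word pairs only ever expose the left context, while the masked version keeps tokens on both sides of each hidden position visible.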