Hope you are enjoying the book! And thanks for sharing your notes! Btw, regarding your nits: I introduced the masked token prediction task of BERT in Fig 1.5 and even refer to BERT-like "LLMs", but you are right, I mostly equate LLMs with "next-word prediction" for simplicity's sake.
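In case it helps to see the distinction concretely, here is a minimal sketch (not from the book) contrasting the two objectives. It uses the Hugging Face transformers pipeline API and the `bert-base-uncased` and `gpt2` checkpoints purely as illustrative stand-ins:

```python
from transformers import pipeline

# BERT-style masked token prediction: the model fills in a [MASK]
# token that can sit anywhere in the sentence (bidirectional context).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("The cat sat on the [MASK]."))

# GPT-style next-word prediction: the model extends the text strictly
# left to right, one token at a time (causal/autoregressive context).
generate = pipeline("text-generation", model="gpt2")
print(generate("The cat sat on the", max_new_tokens=5))
```

Same idea either way: both are trained by predicting tokens, but BERT predicts masked-out tokens using context on both sides, while GPT-style models only ever predict the next token, which is why "next-word prediction" is the convenient shorthand.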