AI Dynamics

Global AI News Aggregator

Meta Releases MobileLLM: Depth Critical Over Width

Meta has released MobileLLM, with model checkpoints at 125M, 350M, 600M, and 1B parameters. Notes on the release:

Depth vs. width: Contrary to the scaling law (Kaplan et al., 2020), depth is more critical than width for small LLMs, improving both abstract concept capture and final performance.

Embedding ...
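The depth-versus-width point above can be made concrete with a rough parameter count. This is a minimal sketch, assuming a generic transformer block whose feed-forward width is four times the hidden size, in which case non-embedding parameters come to roughly 12 · n_layers · d_model². The function name and both shapes are hypothetical, chosen only so the two budgets match. They are not MobileLLM's published configurations, and MobileLLM's actual block design may differ from this generic one.

```python
def non_embedding_params(n_layers: int, d_model: int) -> int:
    """Approximate non-embedding parameter count of a generic transformer.

    Assumes feed-forward width = 4 * d_model and attention width = d_model,
    which gives about 12 * d_model**2 parameters per layer.
    """
    return 12 * n_layers * d_model ** 2


# Hypothetical shapes for illustration only (not MobileLLM's real configs).
shapes = {
    "deep_thin": {"n_layers": 32, "d_model": 512},
    "shallow_wide": {"n_layers": 8, "d_model": 1024},
}

for name, cfg in shapes.items():
    total = non_embedding_params(cfg["n_layers"], cfg["d_model"])
    print(f"{name}: {cfg['n_layers']} layers x {cfg['d_model']} wide = {total:,} params")
```

Both shapes land on the same approximate budget, so a comparison between them isolates the effect of shape. The claim in the post is that, at this small scale, the deeper and thinner option tends to perform better.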

→ View original post on X (@reach_vb)
