AI Dynamics

Global AI News Aggregator

Language Models Memory Capacity Scaling Laws Physics

Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws. Very interesting (if somewhat controversial): the authors claim that models, after seeing the data 1000 times during training, can memorize 2 bits per parameter (they arrive at this number via quantization with AutoGPTQ).
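To get a feel for what 2 bits per parameter would mean in practice, here is a rough back-of-the-envelope sketch. It simply multiplies a parameter count by the quoted figure; the function names and the example model sizes are illustrative assumptions, not anything taken from the paper.

```python
# Back-of-the-envelope arithmetic for the "2 bits per parameter" claim.
# The constant comes from the post; helper names and model sizes below
# are hypothetical and chosen only for illustration.

BITS_PER_PARAM = 2.0  # capacity figure quoted in the post


def knowledge_capacity_bits(num_params: int, bits_per_param: float = BITS_PER_PARAM) -> float:
    """Total memorized bits implied by the claim for a model of this size."""
    return num_params * bits_per_param


def bits_to_gigabytes(bits: float) -> float:
    """Convert bits to decimal gigabytes (1 GB = 1e9 bytes)."""
    return bits / 8 / 1e9


if __name__ == "__main__":
    # Example sizes are assumptions, not models evaluated in the paper.
    for label, num_params in [("1B", 1_000_000_000), ("7B", 7_000_000_000)]:
        bits = knowledge_capacity_bits(num_params)
        print(f"{label} params: {bits:.3g} bits, about {bits_to_gigabytes(bits):.2f} GB")
```

Under this reading, a 7-billion-parameter model would hold roughly 14 billion bits, or about 1.75 GB of memorized information, provided the training-exposure condition in the claim holds.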

→ View original post on X — @jxmnop