AI Dynamics

Global AI News Aggregator


GPT Model Capacity Analysis: Linear Trend at 3.6 Bits-Per-Parameter

We then compute the capacity of different models (GPT models with varying numbers of layers and hidden dimensions), averaged over hundreds of models trained in fp32. The resulting curve indicates a linear trend of around 3.6 bits per parameter, regardless of the exact details.
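To make the linear trend concrete, here is a minimal sketch of the arithmetic it implies. It assumes capacity scales as roughly 3.6 bits times the parameter count, as stated in the post. The function names (`estimated_capacity_bits`, `fit_slope_through_origin`) and the parameter counts used below are hypothetical illustrations, not the author's code or data.

```python
# Rough arithmetic implied by a linear capacity trend.
# ASSUMPTION: capacity ~= 3.6 bits * parameter count (slope quoted in the post).
# All names and example parameter counts here are illustrative, not from the source.

BITS_PER_PARAM = 3.6   # approximate slope reported in the post
FP32_BITS = 32         # storage bits per fp32 parameter


def estimated_capacity_bits(n_params, bits_per_param=BITS_PER_PARAM):
    """Estimated capacity in bits for a model with n_params parameters."""
    return n_params * bits_per_param


def fit_slope_through_origin(param_counts, capacities_bits):
    """Least-squares slope of capacity vs. parameter count, forcing the line through 0."""
    numerator = sum(p * c for p, c in zip(param_counts, capacities_bits))
    denominator = sum(p * p for p in param_counts)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to show the scaling.
    sizes = [100_000, 1_000_000, 10_000_000]
    for n in sizes:
        bits = estimated_capacity_bits(n)
        print(f"{n:>10,d} params -> {bits / 8 / 1e6:.3f} MB of estimated capacity")

    # Fraction of the fp32 storage budget that the trend corresponds to.
    print(f"fraction of fp32 storage: {BITS_PER_PARAM / FP32_BITS:.4f}")

    # Sanity check: points generated on the line recover the slope.
    synthetic = [estimated_capacity_bits(n) for n in sizes]
    print(f"recovered slope: {fit_slope_through_origin(sizes, synthetic):.2f}")
```

Under this assumption, a model's estimated capacity is a bit over a tenth of the bits used to store its fp32 weights (3.6 / 32 ≈ 0.11). The through-origin fit is one simple way to estimate such a slope from measured (parameter count, capacity) pairs; the post does not say how its trend line was fitted.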

→ View original post on X — @jxmnop