AI Dynamics

Global AI News Aggregator

VRAM Requirements for AI Models Across Hardware Architectures

It should work on CPU, CUDA, and MPS backends. Hardware requirements: in fp16/bf16, weights take roughly 2 bytes per parameter, so:

1B — ~2 GB VRAM
600M — ~1.2 GB VRAM
350M — ~700 MB VRAM
125M — ~250 MB VRAM

Of course, at lower quantizations (Q4/Q8) you reduce this even further.
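The figures above follow directly from bytes-per-parameter arithmetic. A minimal sketch of that estimate (a rough rule of thumb for weights only — it ignores activations, KV cache, and quantization block-scale overhead; the function name and dtype table are illustrative, not from any library):

```python
# Approximate VRAM needed just to hold model weights, by precision.
# Q4/Q8 entries ignore the small per-block scale/zero-point overhead
# that real quantization formats add.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "bf16": 2.0, "q8": 1.0, "q4": 0.5}

def weights_vram_gb(num_params: float, dtype: str = "fp16") -> float:
    """Rough GB required to load the weights in the given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

for label, params in [("1B", 1e9), ("600M", 6e8), ("350M", 3.5e8), ("125M", 1.25e8)]:
    fp16 = weights_vram_gb(params, "fp16")
    q4 = weights_vram_gb(params, "q4")
    print(f"{label}: ~{fp16:.2f} GB in fp16/bf16, ~{q4:.2f} GB at Q4")
```

Running this reproduces the numbers in the post (1B → ~2 GB, 600M → ~1.2 GB, and so on), and shows why Q4 cuts the footprint to roughly a quarter of fp16.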

→ View original post on X — @reach_vb
