AI Dynamics

Global AI News Aggregator

FP8 Training: Understanding Gradient and Weight Data Types

Why does it matter for fp8? Are grads and weights different data types in that case? (Sorry if it's a dumb question – I've never done any fp8 training)

→ View original post on X — @jeremyphoward