AI Dynamics

Global AI News Aggregator

Whisper Model 8-bit Loading: Memory-Efficient Inference

Like all models in the transformers library, every Whisper checkpoint can be loaded in a memory-efficient way! Passing load_in_8bit=True loads the model with 8-bit precision. P.S. You can load a Whisper-large model in under 6.6 GB of VRAM.
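A minimal sketch of what that looks like in practice, assuming `transformers`, `accelerate`, and `bitsandbytes` are installed and a CUDA GPU is available (the `openai/whisper-large-v2` checkpoint and the helper name `load_whisper_8bit` are illustrative choices, not from the original post):

```python
# Sketch: loading a Whisper checkpoint with 8-bit weights via bitsandbytes.
# Requires: pip install transformers accelerate bitsandbytes (and a CUDA GPU).

def load_whisper_8bit(checkpoint: str = "openai/whisper-large-v2"):
    """Load a Whisper model in 8-bit precision (~6.6 GB VRAM for large)."""
    # Imports are deferred so the sketch can be read without the libraries installed.
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    processor = WhisperProcessor.from_pretrained(checkpoint)
    model = WhisperForConditionalGeneration.from_pretrained(
        checkpoint,
        load_in_8bit=True,   # quantize weights to int8 via bitsandbytes
        device_map="auto",   # let accelerate place layers on the available GPU(s)
    )
    return processor, model

if __name__ == "__main__":
    processor, model = load_whisper_8bit()
    # Rough sanity check of the quantized model's size in bytes.
    print(model.get_memory_footprint())
```

Note that `device_map="auto"` is what lets accelerate dispatch the quantized weights; without it, 8-bit loading will raise an error.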

→ View original post on X — @reach_vb
