AI Dynamics

Global AI News Aggregator

Stage 3 Offloading and CPUAdam Performance Comparison with FSDP

Relatively similar. I think stage 3 with offloading and CPUAdam was even a tad better, but I'd have to double-check on Wednesday when I'm back at my computer. I usually use DeepSpeed but opted for FSDP here to reduce external dependencies.
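For readers unfamiliar with the setup being compared, here is a minimal sketch of what a DeepSpeed "stage 3 with offloading" configuration might look like. It is not the configuration used in the post: the helper name `build_zero3_offload_config` and the batch-size value are hypothetical, and only the `zero_optimization` keys are taken from DeepSpeed's documented config schema.

```python
import json


def build_zero3_offload_config(micro_batch_size=1):
    """Return a DeepSpeed-style config dict for ZeRO stage 3 with CPU offload.

    Hypothetical helper for illustration; the values are placeholders,
    not the settings from the original post.
    """
    return {
        # Placeholder value; tune for the actual model and hardware.
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "zero_optimization": {
            # Stage 3 shards parameters, gradients, and optimizer states.
            "stage": 3,
            # Keep optimizer states in CPU memory. In this mode the optimizer
            # step usually runs on the CPU, typically with DeepSpeedCPUAdam
            # (from deepspeed.ops.adam).
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            # Keep sharded parameters in CPU memory as well.
            "offload_param": {"device": "cpu", "pin_memory": True},
        },
    }


if __name__ == "__main__":
    # Print the config as JSON, the format DeepSpeed config files use.
    print(json.dumps(build_zero3_offload_config(), indent=2))
```

A dict like this would normally be passed to DeepSpeed at initialization or saved as a JSON file. The FSDP route mentioned in the post avoids this extra dependency because FSDP ships with PyTorch.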

→ View original post on X: @rasbt
