Corrected Calculations (the original conclusion doesn't change: the estimate drops from 5e+24 to 1.1e+24 FLOPs, which is still two orders of magnitude behind):
LLaMa2-70B was trained on 2T tokens using 1.7M hours of A100 GPU time. At an HFU of ~60%, training took ~1.1e+24 FLOPs (1.7M hours × 3600 s/hour × 312 TFLOPS × 0.6 ≈ 1.1e+24).
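As a sanity check, here is that arithmetic as a minimal Python sketch (312 TFLOPS is the A100's peak dense BF16/FP16 throughput; the ~60% HFU is this post's assumption):

gpu_hours = 1.7e6       # reported A100 GPU-hours for LLaMa2-70B
peak_flops = 312e12     # A100 peak dense BF16/FP16 FLOPs per second
hfu = 0.60              # assumed hardware FLOPs utilization

total_flops = gpu_hours * 3600 * peak_flops * hfu
print(f"{total_flops:.2e}")  # ~1.15e+24 FLOPs

For comparison, the common 6ND approximation lands in the same ballpark: 6 × 70e9 parameters × 2e12 tokens ≈ 8.4e+23 FLOPs.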