Great question. Yes, I was surprised that 10B tokens seemed enough. I believe GPT-2 was trained on somewhere around ~100B tokens. The reason we reach this performance in only 10B tokens, I think, may be the following: 1. FineWeb could just be higher quality than WebText on a per-token basis. This was
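(Not part of the original reply, just a rough sketch.) If you want to sanity-check the ~10B figure yourself, something like the snippet below should work, assuming the public `HuggingFaceFW/fineweb` release with its `sample-10BT` subset and the GPT-2 BPE tokenizer; the exact config names may differ from what the post actually used.

```python
# Sketch: stream the FineWeb 10BT sample and count GPT-2 tokens,
# to sanity-check the ~10B-token figure. Assumes the HuggingFaceFW/fineweb
# dataset with a "sample-10BT" config and a "text" column.
import tiktoken
from datasets import load_dataset

enc = tiktoken.get_encoding("gpt2")  # same BPE vocabulary GPT-2 used

# Streaming avoids downloading the whole sample up front.
ds = load_dataset(
    "HuggingFaceFW/fineweb",
    name="sample-10BT",
    split="train",
    streaming=True,
)

total_tokens = 0
for i, doc in enumerate(ds):
    # encode_ordinary skips special-token handling; fine for raw web text
    total_tokens += len(enc.encode_ordinary(doc["text"]))
    if i % 100_000 == 0:
        print(f"{i} docs, {total_tokens:,} tokens so far")

print(f"total: {total_tokens:,} tokens")  # should land in the ~10B range
```

The point of the comparison is just the rough ratio: ~10B tokens of FineWeb versus the ~100B-token budget GPT-2 reportedly saw, i.e. roughly 10x fewer tokens for comparable evaluation numbers, which is what makes the per-token-quality explanation plausible.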