llm.c Outperforming GPT-2/3 with Fewer Training Tokens

AI Dynamics

Global AI News Aggregator

02 June 2024 19h10

In llm.c pretraining we were already mildly perplexed why we seem to be outperforming GPT-2 & 3 (124M) when training on just 10B tokens instead of something closer to 100-300B, per the original papers. I suspect a good chunk of it may be just the dataset quality, so I'm eager to retrain

→ View original post on X: @karpathy, 2 June 2024
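To put the token counts quoted in the post side by side, here is a minimal arithmetic sketch. It uses only the figures from the post (a 124M-parameter model, 10B training tokens, and a 100-300B token range attributed to the original papers); the function names are hypothetical and chosen for illustration only.

```python
# Figures taken from the post; nothing here is measured or benchmarked.
PARAMS = 124e6          # model size quoted in the post (124M)
RUN_TOKENS = 10e9       # llm.c training tokens quoted in the post (10B)
BASELINE_LOW = 100e9    # lower end of the range attributed to the original papers
BASELINE_HIGH = 300e9   # upper end of that range


def token_budget_ratio(baseline_tokens: float, run_tokens: float) -> float:
    """How many times more tokens the baseline used than the run (hypothetical helper)."""
    return baseline_tokens / run_tokens


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter (hypothetical helper)."""
    return tokens / params


if __name__ == "__main__":
    low = token_budget_ratio(BASELINE_LOW, RUN_TOKENS)
    high = token_budget_ratio(BASELINE_HIGH, RUN_TOKENS)
    print(f"Baseline used {low:.0f}x to {high:.0f}x more tokens than the 10B run")
    print(f"Tokens per parameter at 10B: {tokens_per_parameter(RUN_TOKENS, PARAMS):.1f}")
```

Under these figures, the 10B-token run uses roughly one tenth to one thirtieth of the quoted baseline budget, which is the gap the post attributes in part to dataset quality.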

Tags: AI, Code, Innovation, LLMs, Machine Learning, Open Source, Research
