GPT-3 Training Surpasses Expected Performance on FineWeb Dataset

The example here is the llm.c GPT-3 (124M) training run on FineWeb (figure cropped at 250B tokens). We seem to surpass the GPT-3 HellaSwag score (green line) at ~150B tokens, whereas per the paper this was expected at around 300B tokens. I will re-run with FineWeb-Edu, and I do want to be a bit careful about drawing conclusions.
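To make the comparison concrete, here is a minimal sketch (not part of llm.c) of how one might locate the crossover point where the run's HellaSwag accuracy first exceeds the GPT-3 baseline shown by the green line. The baseline value and the evaluation checkpoints below are illustrative assumptions, not numbers taken from the actual run logs.

```python
# Sketch: find the first token count at which HellaSwag accuracy crosses a baseline.
# BASELINE_ACC is an assumed placeholder for the GPT-3 (small) HellaSwag score;
# substitute the value reported in the paper.
BASELINE_ACC = 0.337

def first_crossover(eval_points, baseline=BASELINE_ACC):
    """eval_points: list of (tokens_seen, hellaswag_accuracy) in training order.
    Returns the first token count at which accuracy >= baseline, or None."""
    for tokens, acc in eval_points:
        if acc >= baseline:
            return tokens
    return None

if __name__ == "__main__":
    # Hypothetical evaluation checkpoints (tokens, accuracy), for illustration only.
    evals = [(50e9, 0.305), (100e9, 0.328), (150e9, 0.341), (200e9, 0.352)]
    tokens = first_crossover(evals)
    if tokens is not None:
        print(f"Surpassed baseline at ~{tokens / 1e9:.0f}B tokens")
    else:
        print("Baseline not yet surpassed")
```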