This is the interesting bit. As we move to the next phase of scaling reasoning models with RL, data and compute converge. The next breakthroughs require moving from hundreds of thousands of GPUs to millions of GPUs.
Scaling Reasoning Models: From Thousands to Millions of GPUs