The economics just flipped completely.

Training GPT-4: $100M+ in compute
Inference scaling: $0.10 per complex query

You can make a 7B model as smart as GPT-4 by letting it think 100x longer at inference. Smaller models + more thinking time = beats bigger models at a fraction of
Economics flip: small models with more inference match GPT-4
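
To put the two price tags side by side, here is a rough back-of-the-envelope sketch. It takes the post's figures at face value ($100M training compute, $0.10 per query, 7B parameters, 100x thinking) and treats them as assumptions, not measurements. The function names are hypothetical, and the "parameter-equivalent" number assumes per-answer compute grows linearly with thinking length, which ignores attention and memory overheads.

```python
# Figures quoted in the post, treated here as assumptions.
# Money is kept in integer cents to avoid floating-point rounding.
TRAINING_COST_CENTS = 100_000_000 * 100   # "$100M+" in training compute
COST_PER_QUERY_CENTS = 10                 # "$0.10 per complex query"
SMALL_MODEL_PARAMS = 7_000_000_000        # "7B model"
THINKING_MULTIPLIER = 100                 # "think 100x longer"


def queries_for_budget(budget_cents: int, cost_per_query_cents: int) -> int:
    """Whole queries a budget buys at a flat per-query price."""
    if cost_per_query_cents <= 0:
        raise ValueError("cost_per_query_cents must be positive")
    return budget_cents // cost_per_query_cents


def parameter_equivalent(params: int, thinking_multiplier: int) -> int:
    """Dense-model size whose single pass would cost about the same compute.

    Assumes compute per answer scales linearly with both parameter count
    and the number of tokens generated (a simplification).
    """
    return params * thinking_multiplier


if __name__ == "__main__":
    n = queries_for_budget(TRAINING_COST_CENTS, COST_PER_QUERY_CENTS)
    print(f"{n:,} queries cost as much as the training run")  # 1,000,000,000

    eq = parameter_equivalent(SMALL_MODEL_PARAMS, THINKING_MULTIPLIER)
    print(f"{eq:,} parameter-equivalents per answer")  # 700,000,000,000
```

Under these assumptions, the quoted training budget buys about a billion heavy-thinking queries, and a 7B model thinking 100x longer spends roughly the per-answer compute of a 700B dense model answering once.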
