For harder tasks OTOH, only a large enough model will be incentivized to improve their loss (because by definition, they are harder). For example, GPT-4 is way better at math or physics than GPT 3.5. These abilities might appear only when you reach a certain model size.
Model Size and Emergence of Specialized Capabilities in AI
By
–
Leave a Reply