(3 cont.) Can you predict the performance of one model from another?
Can you predict the performance of a 128B model on an unseen task, given results for models up to some smaller threshold size and the performance of 128B models on other tasks?
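To make the setup concrete, here is a minimal sketch of one possible framing, not a method taken from this post. It assumes the large model's score is roughly a linear function of the smaller models' scores on the same task, fits that map on the tasks where the large model has been evaluated, and applies it to the unseen task. The function name `predict_large_on_unseen`, the array layout, and every number below are hypothetical and synthetic; whether a linear map is adequate is exactly the open question.

```python
import numpy as np


def predict_large_on_unseen(small_known, large_known, small_unseen):
    """Hypothetical helper: least-squares map from small-model scores to a large-model score.

    small_known : (n_known_tasks, n_small_models) scores of the smaller models
    large_known : (n_known_tasks,) scores of the large model on those same tasks
    small_unseen: (n_small_models,) scores of the smaller models on the unseen task
    """
    small_known = np.asarray(small_known, dtype=float)
    large_known = np.asarray(large_known, dtype=float)
    small_unseen = np.asarray(small_unseen, dtype=float)

    # Append a constant column so the fit includes an intercept term.
    design = np.hstack([small_known, np.ones((small_known.shape[0], 1))])

    # Ordinary least squares; assumes a linear relationship (an assumption, not a finding).
    weights, *_ = np.linalg.lstsq(design, large_known, rcond=None)

    # Apply the fitted map to the small-model scores on the unseen task.
    return float(np.append(small_unseen, 1.0) @ weights)


# Synthetic illustration only: two smaller models, four tasks with known large-model scores.
small_known = [[0.2, 0.3], [0.4, 0.5], [0.1, 0.6], [0.7, 0.2]]
large_known = [0.35, 0.55, 0.45, 0.55]
print(predict_large_on_unseen(small_known, large_known, [0.3, 0.4]))
```

With only a handful of known tasks the fit is easy to overdetermine or overfit, so in practice the prediction would need to be checked against held-out tasks where the large model's score is actually known.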
Predicting Large Model Performance Across Unseen Tasks