Hyperparams and architectures are *far* more stable across model and data sizes than most people think. For the things that do need to change, we have pretty reliable rules of thumb for nearly all of them. (E.g., OpenAI did nearly all their GPT-4 ablations on ~1000x smaller models.)
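
To make "rule of thumb" concrete, here is a minimal sketch of one commonly cited example. The post doesn't name a specific rule, so the choice is an assumption: under muP-style scaling with Adam, the learning rate for hidden weight matrices is usually shrunk in proportion to width, so a value tuned on a small proxy model can be carried over to a wider one. The function name and the numbers are hypothetical.

```python
def transfer_hidden_lr(base_lr: float, base_width: int, target_width: int) -> float:
    """Scale a hidden-layer Adam learning rate from a small proxy model to a wider one.

    Assumes the muP-style heuristic lr ~ 1/width for hidden weight matrices.
    This is an illustration of a scaling rule of thumb, not a rule taken from the post.
    """
    if base_width <= 0 or target_width <= 0:
        raise ValueError("widths must be positive")
    return base_lr * base_width / target_width


# Hypothetical numbers: LR tuned at width 256, transferred to width 4096.
proxy_lr = 3e-3
big_lr = transfer_hidden_lr(proxy_lr, base_width=256, target_width=4096)
print(f"proxy lr = {proxy_lr:.2e}, transferred lr = {big_lr:.2e}")  # 16x wider, so 16x smaller
```

The point is only that the sweep happens once at small scale and a fixed formula handles the rest. Which formula applies depends on the optimizer and parametrization.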
