Agreed! I'm using very conservative settings for many of the hyperparameters (following the GPT-3 paper where possible) and haven't tried to speed this up at all yet, but I expect a 10X speedup should be possible.
Conservative hyperparameters with potential for 10X speedup