AI Model Size Regulation and Training Compute Requirements

Regulation kicks in at roughly two orders of magnitude more training compute than a ~70B-parameter Transformer trained on 2T tokens, which by the standard C ≈ 6ND estimate is about 10^24 FLOPs. Note that increasing either the dataset size or the model size increases training FLOPs. GPT-4's (rumored) scale would place it above the regulated threshold.
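The threshold arithmetic can be sketched with the common C ≈ 6ND approximation (6 FLOPs per parameter per token; an assumption here, since the post does not spell out its estimate):

```python
# Back-of-the-envelope training-compute estimate, assuming the
# common approximation C ≈ 6 * N * D (FLOPs ≈ 6 × params × tokens).

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense Transformer."""
    return 6 * n_params * n_tokens

# ~70B parameters trained on 2T tokens (Llama-2-70B-like scale)
baseline = training_flops(70e9, 2e12)
print(f"baseline:  {baseline:.1e} FLOPs")    # ≈ 8.4e+23
print(f"threshold: {100 * baseline:.1e} FLOPs")  # two orders of magnitude up
```

Either doubling the parameter count or doubling the token count doubles the estimate, which is why growing the dataset or the model both push a run toward the threshold.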