FWIW, I think it's better to have a single parameter for the input:output token ratio (e.g., 3:1 is pretty common) and compute a blended cost from it. Reserve the screen real estate for the obvious issue with cost here, which is the tradeoff against %MMLU or a blended benchmark score.
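To make the blending concrete, here is a minimal sketch of how a single ratio parameter could collapse two per-token prices into one number. The function name `blended_price`, its parameters, and the prices in the usage line are all hypothetical and chosen only for illustration; the only thing taken from the comment is the idea of weighting by an input:output ratio such as 3:1.

```python
def blended_price(input_price, output_price, input_to_output_ratio=3.0):
    """Weighted average price per token unit for a given input:output mix.

    input_price / output_price: price per unit of tokens (e.g. per 1M tokens),
        in the same currency and unit. Hypothetical inputs, not real quotes.
    input_to_output_ratio: r in an r:1 input:output token mix (3.0 means 3:1).
    """
    if input_to_output_ratio < 0:
        raise ValueError("input_to_output_ratio must be non-negative")
    r = input_to_output_ratio
    # r parts input for every 1 part output, so weights are r/(r+1) and 1/(r+1).
    return (r * input_price + output_price) / (r + 1.0)


# Made-up prices: 2.0 per 1M input tokens, 8.0 per 1M output tokens, 3:1 mix.
example = blended_price(2.0, 8.0, input_to_output_ratio=3.0)
print(example)
```

With one blended number per model, the remaining axis can go to the benchmark score, which is the tradeoff the comment is asking to see.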
Token Ratio Parameter and Cost-Performance Tradeoff in LLM Pricing