They actually might be doing this already but yeah, probably diminishing returns.
Some insights in the "To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis" paper: https://arxiv.org/abs/2305.13230
Scaling LLM Performance: Token-Crisis Solutions and Diminishing Returns