Using Smaller Language Models to Optimize API Token Selection
Yeah, this is definitely true! One idea would be to use a smaller language model to estimate the most probable tokens, then pick a subset of those and query the API with only that subset.
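To make the idea a bit more concrete, here's a rough sketch of what the selection step could look like. Everything in it is an assumption for illustration: the small model is stood in for by a plain dict of per-token scores, `query_api` is a hypothetical callable you'd swap for whatever the real API offers, and the `k` / `mass` cutoffs are just one possible way to "choose a certain subset".

```python
import math
from typing import Callable, Dict, List


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Turn raw per-token scores into probabilities (numerically stable)."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: val / total for tok, val in exps.items()}


def select_candidates(
    proxy_logits: Dict[str, float], k: int = 5, mass: float = 0.9
) -> List[str]:
    """Pick tokens in descending proxy probability.

    Stops once k tokens are chosen or the chosen tokens already cover
    `mass` of the proxy model's probability, whichever comes first.
    Both cutoffs are assumptions, not something from the original idea.
    """
    probs = softmax(proxy_logits)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    chosen: List[str] = []
    covered = 0.0
    for tok, p in ranked:
        if len(chosen) >= k or covered >= mass:
            break
        chosen.append(tok)
        covered += p
    return chosen


def query_subset(
    candidates: List[str], query_api: Callable[[str], float]
) -> Dict[str, float]:
    """Spend one (hypothetical) API call per shortlisted token."""
    return {tok: query_api(tok) for tok in candidates}


if __name__ == "__main__":
    # Toy scores standing in for a small model's next-token logits.
    toy_logits = {"the": 3.0, "a": 2.0, "cat": 0.5, "dog": 0.1, "zebra": -2.0}
    shortlist = select_candidates(toy_logits, k=3, mass=0.95)
    # Placeholder for the real API: returns a dummy score per token.
    print(query_subset(shortlist, query_api=lambda tok: 0.0))
```

The point is just that the number of API queries scales with the shortlist size rather than the full vocabulary; how well this works depends on how closely the small model's ranking matches the big one's.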