If I understood correctly, they fine-tuned with 16k-token sequences, but the model can still handle inputs of up to 100k tokens?
Fine-tuning with 16k tokens but capable of handling 100k
