I agree with you. Do you mean the 32B distilled models or the 3B one I hinted at toward the end? I think 32B should be just fine, but yeah, 3B will be trickier; I'd say that one is more for educational purposes.
Distilled Models: 32B vs 3B Trade-offs and Use Cases