Fuck it – it’s raining smol LMs – SmolLM2 1.7B – beats Qwen 2.5 1.5B & Llama 3.2 1B, Apache 2.0 licensed, trained on 11 trillion tokens
> 135M, 360M, and 1.7B parameter models
> Trained on FineWeb-Edu, DCLM, The Stack, along w/ new mathematics and coding datasets
> Specialises in text rewriting, summarisation, and function calling
SmolLM2 1.7B Beats Larger Models with Apache 2.0 License