Evals show that our SFT model, LFM-1.3B-Distill, performs slightly better with a 32k-token budget. It is competitive with models based on DeepSeek-R1-Distill-Qwen-1.5B while being roughly 15% smaller.
LFM-1.3B-Distill SFT Model Is Competitive with Larger DeepSeek-Based Models
