Can shorter answers be smarter? Researchers say yes.
— 机器之心 JIQIZHIXIN (@jiqizhixin) November 9, 2025
A new method, DLER, pairs simple truncation with better RL tricks, such as batch reward normalization, higher clipping, and dynamic sampling, to cut output length by more than 70% while improving accuracy over prior baselines.
It also scales… pic.twitter.com/75UXfQ0a6K
