@aymericroucher - AI Dynamics

Deep research and Operator limitations; Operator could do deep research

By

@aymericroucher

–

17 July 2025 19h17

The team acknowledges that some agents are still very limited, each in its own way:
– Deep Research cannot interact with webpages (no TextBrowser can)
– Operator has trouble reading through long pages
Personal take: Operator should theoretically be able to do deep research as well.

→ View original post on X — @aymericroucher

17 July 2025

ChatGPT agent live announcement combining Deep Research, Operator, Terminal code editing

By

@aymericroucher

–

17 July 2025 19h15

ChatGPT agent live announcement underway, with @sama onboard!
This Agent patches together
– Deep Research (TextBrowser)
– Operator (GUI Agent)
– Terminal code editing
to unlock all these capacities together in an agent.

→ View original post on X — @aymericroucher

17 July 2025

WebSailor paper uses agentic RL post-training to boost Deep Research scores

By

@aymericroucher

–

17 July 2025 17h37

The recent WebSailor paper by Alibaba-NLP shows how to post-train models for Deep Research. Good insights in there about creating a dataset, then the training recipe. I particularly like how the agentic RL at the end of post-training improves scores by ~4 p.p. across the board: RL +

→ View original post on X — @aymericroucher

17 July 2025

Muon(Clip) vs AdamW comparison request

By

@aymericroucher

–

12 July 2025 9h08

@eliebakouch could you do a comparison of Muon(Clip) vs AdamW to explain the differences?

→ View original post on X — @aymericroucher

12 July 2025

Huggingface explains stateless direct response MCP server choice

By

@aymericroucher

–

10 July 2025 11h01

If you're developing MCP servers, you should read how the @huggingface team built the Hub MCP: they explain why they chose a Stateless + Direct Response server over other options!
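To make the two terms concrete, here is a minimal sketch. It reads "stateless" as "every request is answered from the request alone, with no per-client session" and "direct response" as "one complete JSON reply per request, with no stream left open". That reading is an assumption, not a summary of the Hub MCP code. The `TOOLS` registry, the `echo` tool and the `handle` function are hypothetical names for illustration. Only the JSON-RPC 2.0 envelope and the `tools/list` / `tools/call` method names come from the MCP protocol.

```python
# Hypothetical tool registry for illustration; not the Hub MCP's actual tools.
TOOLS = {
    "echo": {
        "description": "Return the input text unchanged.",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
        "run": lambda args: args["text"],
    }
}


def _error(req_id, code: int, message: str) -> dict:
    """Build a JSON-RPC 2.0 error reply."""
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}


def handle(request: dict) -> dict:
    """Answer one JSON-RPC request using only the request itself (no session)."""
    req_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        tools = [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": tools}}

    if method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(req_id, -32602, "Unknown tool")
        text = tool["run"](params.get("arguments", {}))
        # Direct response: the whole result goes back in a single reply.
        result = {"content": [{"type": "text", "text": text}], "isError": False}
        return {"jsonrpc": "2.0", "id": req_id, "result": result}

    return _error(req_id, -32601, "Method not found")
```

Because `handle` keeps nothing between calls, any worker behind a load balancer can answer any request, which is the usual argument for this kind of design.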

→ View original post on X — @aymericroucher

10 July 2025

SmolLM3: Powerful built-in tool-calling capabilities

By

@aymericroucher

–

09 July 2025 20h19

Reminder: SmolLM3 comes with built-in tool-calling, and it works really well!

→ View original post on X — @aymericroucher

9 July 2025

FlashAttention less useful with MLP GEMM latency dominance

By

@aymericroucher

–

09 July 2025 19h13

Maybe FlashAttention is not that useful when you have MLP GEMMs that eat so much latency? Interesting graph in the latest blog post from @gpus_go_brrr!
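A back-of-the-envelope FLOP count gives some intuition for this. The sketch below is not taken from the blog post: it assumes a standard transformer layer with a non-gated MLP, counts a multiply-add as 2 FLOPs, and ignores softmax, norms and biases. FLOPs are only a proxy for latency, so treat the numbers as indicative. The function names `layer_flops` and `core_share` are made up for this example.

```python
def layer_flops(seq_len: int, d_model: int, d_ff: int) -> dict:
    """Rough forward-pass FLOPs for one transformer layer on one sequence."""
    return {
        # Attention core (the part FlashAttention targets): QK^T and
        # softmax(.)V, each about 2 * n^2 * d FLOPs.
        "attention_core": 4 * seq_len**2 * d_model,
        # Q, K, V and output projections: four d x d GEMMs.
        "attention_proj": 8 * seq_len * d_model**2,
        # Non-gated MLP: up and down projections, two d x d_ff GEMMs.
        "mlp": 4 * seq_len * d_model * d_ff,
    }


def core_share(seq_len: int, d_model: int, d_ff: int) -> float:
    """Fraction of the layer's FLOPs spent in the attention core."""
    flops = layer_flops(seq_len, d_model, d_ff)
    return flops["attention_core"] / sum(flops.values())


if __name__ == "__main__":
    for n in (1024, 8192, 32768):
        print(n, round(core_share(n, d_model=4096, d_ff=16384), 3))
```

With `d_ff = 4 * d_model`, the share simplifies to `n / (n + 6 * d_model)`. Under these assumptions the attention core is about 4% of the layer's FLOPs at 1024 tokens with `d_model = 4096`, and only passes half at sequence lengths above `6 * d_model`.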

→ View original post on X — @aymericroucher

9 July 2025

SmolLM3-3B fills Qwen’s Pareto gap via agentic post-training

By

@aymericroucher

–

08 July 2025 18h51

Qwen left a hole in the Pareto frontier of optimal performance for a given size… So we just filled it: introducing SmolLM3-3B. I helped the SmolLM team on the "make it agentic" part, by post-training the model on agent traces with @akseljoonas: the model is now also on

→ View original post on X — @aymericroucher

8 July 2025

Agents too unreliable, use only when no other choice

By

@aymericroucher

–

03 July 2025 11h29

Agents are too unreliable => Use them only when you have no choice! > Don't get me wrong, agentic apps are powerful and unlock vast fields of previously impossible use cases, but indeed people often try to use them in use cases where they don't belong. @hugobowne just

→ View original post on X — @aymericroucher

3 July 2025