6. The Markovian Thinker
DAIR.AI (@dair_ai), October 12, 2025
A new RL thinking environment that keeps an LLM’s effective state constant by chunking long chains of thought and carrying over only a short textual state between chunks. https://t.co/mqXfF1XG6h
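To make the chunk-and-carry-over idea concrete, here is a minimal sketch of the control loop only. It is not the paper's implementation: the names `markovian_think` and `toy_generate`, the `[DONE]` stop marker, the budget values, and the use of whitespace-separated words as a stand-in for tokens are all assumptions made for illustration. The point it shows is that each chunk's prompt is built from the query plus a short tail of the previous chunk, so the prompt size stays bounded no matter how long the overall reasoning runs.

```python
from typing import Callable, List


def markovian_think(
    query: str,
    generate: Callable[[str, int], str],
    chunk_budget: int = 8,    # assumed: max "tokens" (words here) per chunk
    carry_budget: int = 2,    # assumed: size of the textual state carried over
    max_chunks: int = 10,
    stop_marker: str = "[DONE]",  # hypothetical end-of-reasoning marker
) -> List[str]:
    """Reason in fixed-size chunks, carrying only a short tail between them."""
    carry = ""
    chunks: List[str] = []
    for _ in range(max_chunks):
        # The prompt is the query plus the short carryover, never the full history.
        prompt = query if not carry else f"{query}\n{carry}"
        words = generate(prompt, chunk_budget).split()[:chunk_budget]
        chunks.append(" ".join(words))
        if stop_marker in words:
            break
        # Keep only the last few words as the state for the next chunk.
        carry = " ".join(words[-carry_budget:])
    return chunks


def toy_generate(prompt: str, budget: int) -> str:
    """Hypothetical stand-in for a model call: keep counting up to 20."""
    seen = [int(w) for w in prompt.split() if w.isdigit()]
    start = seen[-1] + 1 if seen else 1
    out: List[str] = []
    # Emit at most budget - 1 numbers so the stop marker always fits.
    for n in range(start, start + budget - 1):
        out.append(str(n))
        if n == 20:
            out.append("[DONE]")
            break
    return " ".join(out)
```

Calling `markovian_think("count to twenty", toy_generate)` produces three chunks, and the toy generator only ever sees the query plus the last two words of the preceding chunk. In a real setup `generate` would be a model call and the carryover would presumably be something the model learns to write under RL, which this sketch does not attempt to model.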