This new paper from Alibaba Group turns context management into a learnable RL policy, so the agent can decide when to store, update, or delete long-term information and when to retrieve, summarize, and filter short-term context, bringing a fresh way to tackle LLMs used
Alibaba’s Learnable RL Policy for LLM Context Management
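To make the idea concrete, the six operations named in the summary can be read as a discrete action space that a policy chooses from at each step. The following is only a minimal sketch under that assumption, not the paper's implementation: `MemoryAction`, `AgentMemory`, and `apply` are hypothetical names, the actions are picked by hand where a trained RL policy would pick them, and the summarize step uses plain string joining as a stand-in for an LLM-generated summary.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class MemoryAction(Enum):
    """Hypothetical action space: what a learned policy could choose from."""
    # Long-term memory operations
    STORE = auto()
    UPDATE = auto()
    DELETE = auto()
    # Short-term context operations
    RETRIEVE = auto()
    SUMMARIZE = auto()
    FILTER = auto()


@dataclass
class AgentMemory:
    """Toy memory state: a key-value long-term store plus a short-term context list."""
    long_term: dict = field(default_factory=dict)
    short_term: list = field(default_factory=list)

    def apply(self, action: MemoryAction, key: str = "", value: str = "") -> None:
        if action is MemoryAction.STORE:
            # Add a new entry; an existing key is left unchanged.
            self.long_term.setdefault(key, value)
        elif action is MemoryAction.UPDATE:
            # Overwrite only entries that already exist.
            if key in self.long_term:
                self.long_term[key] = value
        elif action is MemoryAction.DELETE:
            self.long_term.pop(key, None)
        elif action is MemoryAction.RETRIEVE:
            # Copy a long-term entry into the working context.
            if key in self.long_term:
                self.short_term.append(self.long_term[key])
        elif action is MemoryAction.SUMMARIZE:
            # Stand-in for an LLM summary: collapse the context into one item.
            if self.short_term:
                self.short_term = [" | ".join(self.short_term)]
        elif action is MemoryAction.FILTER:
            # Keep only context items that mention the given substring.
            self.short_term = [item for item in self.short_term if value in item]


# Hand-picked action sequence; a trained policy would select these instead.
memory = AgentMemory()
memory.apply(MemoryAction.STORE, key="user_city", value="Hangzhou")
memory.apply(MemoryAction.UPDATE, key="user_city", value="Beijing")
memory.apply(MemoryAction.RETRIEVE, key="user_city")
print(memory.short_term)  # ['Beijing']
```

In an RL setup, each `apply` call would correspond to one step of the environment, with the reward coming from downstream task success rather than from the memory operation itself.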
