AI Dynamics

Global AI News Aggregator

MSA Inference Code Officially Open-Sourced, Supporting Long-Text Memory at the Hundred-Million-Token Scale

MSA's inference code is officially open-sourced on time 🫡

EverMind (@evermind): A few weeks ago we published our Memory Sparse Attention (MSA) paper, a new way to give AI models long-term memory that actually works.

Today's LLMs and agents forget. They can only hold so much context before things start falling apart. We built a system that lets a model remember up to 100 million tokens, the length of about a thousand books, and still find the right answer with less than 9% performance loss. On several benchmarks, our 4-billion-parameter model even beats RAG systems built on models 58× its size.

The idea? Instead of searching a separate database and hoping the right info comes back (that's how RAG works), we built the memory directly into how the model thinks. It learns what to remember and what to ignore, end to end, with no separate retrieval pipeline.

The response to the paper blew us away. Researchers and engineers everywhere were asking the same thing: "When can we see the code?" So we got to work, cleaned up the inference code, documented it, and made it ready for the community to dig into.

You asked for it. We open-sourced it. github.com/EverMind-AI/MSA
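The post itself does not spell out the mechanism; the released code and paper define it. As a rough illustration of the general idea described above (attending sparsely over an internal memory of stored tokens instead of querying an external retrieval database), here is a minimal NumPy sketch. The function name, the top-k selection rule, and all shapes are assumptions for illustration, not EverMind's actual algorithm:

```python
import numpy as np

def topk_sparse_memory_attention(query, memory_keys, memory_values, k=8):
    """Illustrative sketch (not EverMind's MSA): attend over only the
    k highest-scoring entries of a stored memory.

    query:         (d,)   current query vector
    memory_keys:   (M, d) keys of M stored memory tokens
    memory_values: (M, d) values of the same tokens
    """
    d = query.shape[0]
    scores = memory_keys @ query / np.sqrt(d)          # (M,) relevance scores
    top = np.argpartition(scores, -k)[-k:]             # indices of top-k slots
    # Softmax over the selected slots only; all other memory is ignored,
    # so cost scales with k rather than with the full memory size M.
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()
    return weights @ memory_values[top]                # (d,) memory readout

rng = np.random.default_rng(0)
M, d = 1000, 64  # a small stand-in for a much larger token memory
keys = rng.standard_normal((M, d))
values = rng.standard_normal((M, d))
q = rng.standard_normal(d)
out = topk_sparse_memory_attention(q, keys, values, k=8)
print(out.shape)  # (64,)
```

In a trained system the scoring and selection would be learned end to end along with the rest of the model, which is the part the sketch cannot capture; it only shows why sparse selection keeps lookup cost flat as the memory grows.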

→ View original post on X — @elliotchen100, 2026-04-03 03:55 UTC
