"Memory Caching: RNNs with Growing Memory" Google's new paper proposes a simple way to give recurrent models a memory that grows with sequence length. So instead of forcing an RNN to compress the full past into 1 fixed hidden state, it caches memory checkpoints across
@askalphaxiv
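A minimal sketch of the cached-checkpoint idea as summarized above (the post is truncated, so the read mechanism, the class name MemoryCachedRNN, and the cache_every argument are illustrative assumptions, not the paper's architecture):

```python
# Sketch: a GRU whose hidden state is checkpointed every `cache_every` steps
# into a growing cache, which the model attends over at each step instead of
# relying on a single overwritten hidden state. Names are illustrative.
import torch
import torch.nn as nn


class MemoryCachedRNN(nn.Module):
    def __init__(self, d_model: int, cache_every: int = 16):
        super().__init__()
        self.cell = nn.GRUCell(d_model, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        self.cache_every = cache_every

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        B, T, D = x.shape
        h = x.new_zeros(B, D)
        cache = []          # grows with sequence length
        outputs = []
        for t in range(T):
            if cache:
                # read from cached checkpoints with single-head attention
                mem = torch.stack(cache, dim=1)              # (B, num_ckpts, D)
                read, _ = self.attn(h.unsqueeze(1), mem, mem)
                h = h + read.squeeze(1)
            h = self.cell(x[:, t], h)
            if (t + 1) % self.cache_every == 0:
                cache.append(h)                              # store a checkpoint
            outputs.append(h)
        return torch.stack(outputs, dim=1)
```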
-
Malicious Intermediary Attacks on LLM Agent Supply Chain Security
By
–
"Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain" The danger of agent security can also exist in the API router, that is between the agent and the provider. As these routers can read prompts, keys, and tool calls in plaintext, even rewrite
-
SkillClaw: Collective Skill Learning Through Agentic Evolution
By
–
"SkillClaw: Let Skills Evolve Collectively with Agentic Evolver" As most AI agents still relearn the same targets from scratch, SkillClaw turns it into shared learning, where it collects agent trajectories across users, groups them by skill, and uses an agentic evolver to spot
-
Interleaved Head Attention Improves Transformer Architecture
By
–
“Interleaved Head Attention” A core limitation of transformers is that standard attention gives you H isolated heads, which means you only get H independent attention patterns. So this paper lets heads mix before attention by creating pseudo-heads from learned combinations of
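A minimal sketch of the stated idea, assuming the pseudo-heads are learned linear combinations of the per-head query and key projections applied before the attention scores are computed; the paper's exact formulation may differ:

```python
# Standard multi-head attention, except Q and K are mixed across the head
# dimension by learned matrices before attention, so each "pseudo-head"
# attends with a combination of the original heads' projections.
import torch
import torch.nn as nn
import torch.nn.functional as F


class InterleavedHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.h, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # learned head-mixing matrices, initialized to identity (= vanilla MHA)
        self.mix_q = nn.Parameter(torch.eye(n_heads))
        self.mix_k = nn.Parameter(torch.eye(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        split = lambda t: t.view(B, T, self.h, self.d_head).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)               # (B, heads, T, d_head)
        # mix across the head dimension *before* attention (the pseudo-heads)
        q = torch.einsum("gh,bhtd->bgtd", self.mix_q, q)
        k = torch.einsum("gh,bhtd->bgtd", self.mix_k, k)
        attn = F.softmax(q @ k.transpose(-2, -1) / self.d_head**0.5, dim=-1)
        y = (attn @ v).transpose(1, 2).reshape(B, T, self.h * self.d_head)
        return self.out(y)
```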
-
New Scientific Publication on alphaxiv
By
–
read more: https://alphaxiv.org/abs/2604.02268
→ View original post on X — @askalphaxiv, 2026-04-07 07:37 UTC
-

SKILL0: Training Agents to Internalize Skills Without Context
By
–
“SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization” Most agent systems use skills like cheat sheets: they retrieve them at runtime, paste them into the prompt, and hope the model follows them. This paper asks: why not train the model with those skills, then slowly remove them until it can do the job from memory? The agent starts training with skill guidance, but over time the helpful skills are taken away; instead of depending on instructions forever, it learns to absorb them into its own parameters. This turns skills from something the model reads into something the model actually knows, and the result is a more efficient agent with much less context overhead yet better performance. Empirically, SKILL0 beats strong RL baselines on ALFWorld and Search-QA while using under 0.5k tokens per step.
→ View original post on X — @askalphaxiv, 2026-04-07 07:37 UTC
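A hedged sketch of the schedule the post describes: skill text is included in the training prompt with a probability that anneals to zero, so later training happens without in-context skills. The linear anneal and the function names are assumptions, not SKILL0's exact recipe:

```python
# Anneal away in-context skill guidance over training so the policy must
# internalize the skills into its parameters.
import random

def skill_keep_prob(step: int, total_steps: int) -> float:
    # linear anneal: always show skills at the start, never at the end
    return max(0.0, 1.0 - step / total_steps)

def build_prompt(task: str, skill_text: str, step: int, total_steps: int) -> str:
    if random.random() < skill_keep_prob(step, total_steps):
        return f"Skill hints:\n{skill_text}\n\nTask: {task}"
    return f"Task: {task}"  # late in training: no in-context skills
```

Dropping the skill stochastically rather than all at once gives the policy a smooth curriculum from guided to unguided behavior, which is one plausible way to realize the "slowly remove them" idea above.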
-
AlphaXIV shares list of suggested papers to implement
By
–
Check out our list of suggested papers to implement here! alphaxiv.org/shared/folder/0…
→ View original post on X — @askalphaxiv, 2026-04-06 19:38 UTC
-
Competition: Replicate Frontier Research with marimo Notebooks
By
–
The best way to learn frontier research is to replicate it yourself. And now, you can also win prizes for that! We are excited to announce our partnership with @marimo_io for a competition to bring research to life. All you have to do is pick a paper, build a marimo notebook that brings the core idea to life, and experiment with the research topic. Prizes: Mac Mini + $500! 👀 Deadline: April 26, 11:59 PM PST. Individual and team submissions are welcome. Full details found below 👇
→ View original post on X — @askalphaxiv, 2026-04-06 19:38 UTC
-
alphaXiv and marimo Notebook Competition with Mac Mini Prize
By
–
We partnered with @askalphaxiv on a competition. Pick a paper, build a marimo notebook that brings the core idea to life, and become one of the few experts on your research topic. Oh, and did we mention you can win a Mac Mini + $1K in prizes? 👀 Full details found here: marimo.io/pages/events/noteb…
→ View original post on X — @askalphaxiv, 2026-04-06 18:17 UTC
-
New scientific article available on AlphaXIV
By
–
read more: https://www.alphaxiv.org/abs/2604.01411
→ View original post on X — @askalphaxiv, 2026-04-06 18:08 UTC