AI Dynamics

Global AI News Aggregator

@askalphaxiv

  • Test-Time Scaling Makes Overtraining Compute-Optimal

    Most scaling laws assume you train once and answer once. But this paper argues that if you already know you'll spend extra compute at test time by sampling many answers, you should train a different model. Instead of a bigger model trained the usual way, it can be better to train a smaller model for much longer. Because smaller models are cheaper to sample many times, those extra tries can beat one expensive shot from a larger model. So the real quantity to optimize is not training compute alone but training + inference together, and the paper shows that overtraining can actually become the compute-optimal choice.
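
    The train + inference tradeoff can be sketched with a back-of-envelope calculation. All numbers and constants below are illustrative assumptions (the common approximations train_flops ≈ 6ND and inference_flops ≈ 2N per token), not the paper's actual figures:

```python
# Illustrative back-of-envelope sketch (NOT the paper's numbers): compare
# total compute (train + inference) for a big model answering once versus a
# smaller, overtrained model sampling k answers per query.

def total_flops(n_params, n_train_tokens, n_queries, tokens_per_answer, samples_per_query):
    train = 6 * n_params * n_train_tokens            # ~6ND training FLOPs
    inference = 2 * n_params * tokens_per_answer * samples_per_query * n_queries
    return train + inference

QUERIES = 10**9   # assumed lifetime inference load
TOKENS = 500      # assumed tokens generated per answer

# Big model, ~20 tokens/param ("Chinchilla-style"), answers once per query.
big = total_flops(70e9, 1.4e12, QUERIES, TOKENS, samples_per_query=1)

# Small model, heavily overtrained (~200 tokens/param), samples 8 answers.
small = total_flops(7e9, 1.4e12, QUERIES, TOKENS, samples_per_query=8)

print(f"big model total:   {big:.3e} FLOPs")
print(f"small model total: {small:.3e} FLOPs")
# Once inference dominates the budget, the overtrained small model can come
# out far cheaper in total even though it samples many times per query.
```

    Under these assumed numbers the small overtrained model wins on total FLOPs; the crossover point depends entirely on the inference load.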

    → View original post on X — @askalphaxiv, 2026-04-06 18:08 UTC

  • Meta Harnesses: Automated Framework Optimization for AI Tasks

    Meta Harnesses is Autoresearch on steroids. Something I've been exploring recently is getting long-running agents to hill-climb on a verifiable task and continuously improve without my intervention. Karpathy's Autoresearch did this pretty well on specific tasks, but this weekend I tried Meta Harnesses, which moves one level of abstraction up.

    What does Meta Harness do? Autoresearch can be used in a harness like Claude Code / Codex to generate experiments to try, evaluate results, and keep looping. Meta Harness generates the harness itself, optimized for a task or a set of tasks. Here, a harness is defined as "a single-file Python program that modifies task-specific prompting, retrieval, memory, and orchestration logic". The idea is that LLMs are very powerful today, but to harness [pun intended] their power, you need to give them the right prompts and context. Meta Harnesses automates coming up with the right prompts and the right way to retrieve context to solve a problem.

    Where did this idea come from? It's from a paper out of Stanford by the author of DSPy, published last week. The paper shows fantastic performance on three tasks: text classification, math reasoning (IMO-level problems), and coding (Terminal Bench 2.0), far outperforming traditional harnesses. The discovered harnesses are interesting: the math one, for example, splits the logic into categories (Combinatorics, Geometry, Number Theory, Algebra) and prompts and handles context differently for each. The coding harness, among other things, pre-processes the tools available in the environment to save exploratory turns.

    When should you use it, and when not? Meta Harnesses seem pretty useful for tackling a specific but wide set of problems where the result is verifiable. In contrast, when I tried it on a single task like chess, it arbitrarily divided the problem into separate sub-tasks (opening, midgame, endgame) and created a different approach for each. This "works" but isn't really clean, because we believe there should be one approach that covers all three. It does far better on things like examinations (JEE, Gaokao), where splitting problems into categories and tackling each with a different strategy is natural. This paper covers a pretty light version of what a harness means. In the future, we could split tasks into harnesses with access to specific kinds of data, specific toolchains, and various models to get even better results. Overall, a pretty cool applied-AI approach to hill-climbing a verifiable task in a specific domain with variety within the problem space.
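
    The outer loop being described (propose a harness, score it on verifiable tasks, keep the best) can be sketched roughly as below. All names and the generate/evaluate interfaces are my own stand-ins, not the paper's API:

```python
# Hypothetical sketch of a meta-harness hill-climbing loop. generate_harness
# stands in for an LLM call that writes a new single-file harness; evaluate
# stands in for actually running it on verifiable tasks and scoring answers.

import random

def generate_harness(best_source, feedback):
    """Dummy stand-in for an LLM proposing a new harness program."""
    return {"source": f"# harness variant {random.randint(0, 10**6)}",
            "parent": best_source, "feedback": feedback}

def evaluate(harness, tasks):
    """Dummy stand-in: a real evaluator would execute the harness on each
    task and check the verifiable result. Here, a random score per task."""
    return sum(random.random() for _ in tasks) / len(tasks)

def meta_harness_loop(tasks, iterations=20):
    best, best_score, feedback = None, float("-inf"), ""
    for _ in range(iterations):
        candidate = generate_harness(best, feedback)
        score = evaluate(candidate, tasks)
        if score > best_score:          # hill-climb: keep only improvements
            best, best_score = candidate, score
        feedback = f"last score: {score:.3f}, best so far: {best_score:.3f}"
    return best, best_score

best, score = meta_harness_loop(tasks=[f"task-{i}" for i in range(5)])
print("best score:", round(score, 3))
```

    The real system presumably feeds much richer feedback (transcripts, failure cases) back into the generator; the point here is only the shape of the loop.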

    → View original post on X — @askalphaxiv, 2026-04-06 16:22 UTC

  • HISA: Hierarchical Indexing for Efficient Sparse Attention in LLMs

    "HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention" Sparse attention can still be slow. And the slow part is often not the attention step itself, but the search step that scans the whole context to find useful tokens. This paper's HISA makes that search cheaper. It first finds the best blocks, then finds the best tokens inside those blocks. This keeps token-level precision, needs no retraining, works with the same downstream attention, and gives up to 3.75x speedup while staying close to the original quality.
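
    The block-then-token search can be illustrated with a minimal two-stage top-k selection. The scoring scheme and sizes below are illustrative assumptions in the spirit of the summary, not the paper's actual method:

```python
# Minimal sketch of hierarchical (block-then-token) selection for sparse
# attention: rank blocks by a cheap per-block summary first, then do
# fine-grained token ranking only inside the winning blocks.

def hierarchical_topk(scores, block_size, top_blocks, top_tokens):
    """scores: per-token relevance scores for one query (e.g. q.k products).
    Returns the selected token indices, sorted."""
    blocks = [scores[i:i + block_size] for i in range(0, len(scores), block_size)]
    # Stage 1: coarse search, one summary score (here: max) per block.
    block_rank = sorted(range(len(blocks)), key=lambda b: max(blocks[b]), reverse=True)
    kept = block_rank[:top_blocks]
    # Stage 2: token-level search restricted to the kept blocks only.
    candidates = [(scores[i], i) for b in kept
                  for i in range(b * block_size, b * block_size + len(blocks[b]))]
    candidates.sort(reverse=True)
    return sorted(i for _, i in candidates[:top_tokens])

scores = [0.1, 0.2, 0.9, 0.1,   # block 0: contains one strong token
          0.0, 0.1, 0.1, 0.0,   # block 1: weak everywhere, never scanned
          0.8, 0.7, 0.1, 0.1]   # block 2: two strong tokens
print(hierarchical_topk(scores, block_size=4, top_blocks=2, top_tokens=3))
# → [2, 8, 9]
```

    The saving is that block 1 is rejected from one summary score instead of four token comparisons; at real context lengths that pruning is where the speedup comes from, while the final selection stays token-granular.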

    → View original post on X — @askalphaxiv, 2026-04-05 19:06 UTC

  • Read more: alphaxiv.org/abs/2603.28458

    → View original post on X — @askalphaxiv, 2026-04-05 19:06 UTC

  • Embarrassingly Simple Self-Distillation Improves Code Generation

    Most code-improvement methods need extra machinery: a stronger teacher, execution-based filtering, or RL. But this paper shows that a model can get much better at code generation by training on its own raw outputs, even though those outputs are not verified, not filtered for correctness, and not produced by a stronger teacher. What makes it even more surprising is that the gains are largest on harder problems, and simple decoding tricks alone cannot recover them. The most counterintuitive result is that it can still help even when the self-generated training data is partly garbage: the paper shows a high-temperature setting where many outputs become gibberish, yet the fine-tuned model still improves materially. That suggests the benefit is not mainly coming from learning correct programs, but from reshaping the model's token probabilities in a better way. Empirically, Qwen3-30B-Instruct improves from 42.4% to 55.3% pass@1 on LiveCodeBench v6, with the biggest gains on harder problems.
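
    The "reshaping token probabilities" intuition can be made concrete with a toy categorical model. This is entirely my own construction, not the paper's method: refitting a distribution by maximum likelihood on its own temperature-scaled samples moves it toward the sampling distribution, so self-training changes the probabilities even with no external correctness signal:

```python
# Toy illustration (my own construction): a categorical "model" is refit by
# MLE on samples drawn from itself at a given temperature. The refit
# distribution tracks the temperature-scaled sampling distribution, showing
# that self-training on raw samples is a probability-reshaping operation.

import math, random
from collections import Counter

def sample(probs, temperature, n):
    # Draw n tokens from a temperature-scaled categorical distribution.
    logits = [math.log(p) / temperature for p in probs]
    z = sum(math.exp(l) for l in logits)
    return random.choices(range(len(probs)),
                          weights=[math.exp(l) / z for l in logits], k=n)

def refit(samples, vocab_size):
    # "Fine-tune" = maximum-likelihood refit on the model's own raw samples.
    counts = Counter(samples)
    return [counts[t] / len(samples) for t in range(vocab_size)]

random.seed(0)
base = [0.5, 0.3, 0.1, 0.1]                            # original distribution
own = sample(base, temperature=1.5, n=10_000)          # noisy self-generated data
new = refit(own, vocab_size=4)
print("base :", base)
print("refit:", [round(p, 2) for p in new])
```

    A real LLM fine-tune is of course far richer than an MLE refit of one categorical, so this only gestures at the mechanism the paper hypothesizes.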

    → View original post on X — @askalphaxiv, 2026-04-05 18:25 UTC

  • Read more: alphaxiv.org/abs/2604.01193

    → View original post on X — @askalphaxiv, 2026-04-05 18:25 UTC

  • Read more: alphaxiv.org/abs/2603.18886

    → View original post on X — @askalphaxiv, 2026-04-04 17:55 UTC

  • Principia: Mathematical Object Reasoning Benchmark for Frontier Models

    “Reasoning over Mathematical Objects” Most reasoning benchmarks still let models answer with multiple choice or short numerics, which makes evaluation easy but also makes the task easier than real STEM reasoning. This paper shows that when you remove the options and ask for the actual object, like an equation, matrix, set, interval, or piecewise function, performance drops sharply, even for frontier models. So the paper proposes Principia: a benchmark, training set, and verifier pipeline built specifically for mathematical-object reasoning, plus on-policy judge training to score these hard outputs reliably. What makes this interesting is that training on these harder outputs also improves standard math and science benchmarks, suggesting the gains reflect better actual reasoning, not just better formatting.
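
    A quick way to see why object-level answers need a verifier pipeline rather than string matching: the same mathematical object can be written many ways. The checker below is my own illustration for finite sets of rationals, not Principia's pipeline:

```python
# Hypothetical grader sketch (not Principia's): normalize a finite set of
# rationals before comparing, so notation differences don't fail the check.

from fractions import Fraction

def parse_set(text):
    """Parse a finite set like '{1, 3/2, -2}' into a frozenset of Fractions."""
    inner = text.strip().removeprefix("{").removesuffix("}")
    return frozenset(Fraction(part.strip()) for part in inner.split(",") if part.strip())

def same_object(answer, reference):
    # Equal as sets, regardless of element order or how each number is written.
    return parse_set(answer) == parse_set(reference)

# '0.5' vs '1/2' and different orderings denote the same set...
print(same_object("{1/2, 3, -2}", "{3, -2, 0.5}"))   # → True
# ...while a genuinely different set still fails.
print(same_object("{1/2, 3}", "{1/2, 4}"))           # → False
```

    Scaling this idea to equations, matrices, intervals, and piecewise functions is exactly why the paper needs a full verifier pipeline and trained judges rather than exact match.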

    → View original post on X — @askalphaxiv, 2026-04-04 17:55 UTC

  • Read more: alphaxiv.org/abs/2603.14389

    → View original post on X — @askalphaxiv, 2026-04-04 02:27 UTC

  • DGPO: Stable Soft-Clipping via Bilateral Decoupled Probability Gradient Decay

    "From log π to π: Taming Divergence in Soft Clipping via Bilateral Decoupled Decay of Probability Gradient Weight" This paper argues that RLVR should optimize probability gradients, not log-probability gradients. So it proposes DGPO, which applies decoupled decay at the clipping boundaries to preserve exploration while preventing gradient divergence. The result is a more stable soft-clipping objective that consistently outperforms GRPO and prior variants on mathematical reasoning benchmarks.
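
    The hard-clip vs soft-clip distinction can be sketched numerically. The exponential decay below is my own hypothetical stand-in for DGPO's schedule (the paper's bilateral decoupled decay is its own construction); the point is only the shape of the gradient weight:

```python
# Illustrative sketch (decay schedule is my assumption, not DGPO's): compare
# the per-token gradient weight under PPO/GRPO-style hard clipping, which
# zeroes the gradient outside the trust region, with a soft-clipping weight
# that decays smoothly past each boundary instead.

import math

EPS = 0.2  # clip range, as in PPO/GRPO

def hard_clip_weight(ratio):
    # Gradient flows only while the ratio stays inside [1 - EPS, 1 + EPS].
    return 1.0 if 1 - EPS <= ratio <= 1 + EPS else 0.0

def soft_clip_weight(ratio, decay=5.0):
    # Hypothetical bilateral decay: full weight inside the region, then an
    # exponential falloff on BOTH sides; the two rates could be decoupled.
    if ratio < 1 - EPS:
        return math.exp(-decay * ((1 - EPS) - ratio))
    if ratio > 1 + EPS:
        return math.exp(-decay * (ratio - (1 + EPS)))
    return 1.0

for r in (0.5, 0.9, 1.0, 1.3, 2.0):
    print(f"ratio={r:.1f}  hard={hard_clip_weight(r):.3f}  soft={soft_clip_weight(r):.3f}")
# The soft weight never hard-zeroes, so out-of-region tokens keep a small,
# bounded gradient (exploration) without the unbounded growth a raw ratio allows.
```

    The stability claim is that the decay keeps this weight bounded as the ratio drifts, which is where a naive soft clip can diverge.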

    → View original post on X — @askalphaxiv, 2026-04-04 02:27 UTC