Special shout-out to our Deep Learning Engineer and blog author @TamerGhattas911 👏
@ai21labs
-

Silent 32-bit Integer Overflow in vLLM Mamba-1 CUDA Kernel
3/4 Root cause: a silent 32-bit integer overflow inside vLLM's CUDA kernel for the Mamba-1 selective scan forward pass. cache_index * ssm_states_batch_stride overflowed once the cache slot index got large enough, corrupting writes with no crash and no warning.
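To make the wraparound concrete, here is a minimal host-side C++ sketch of the arithmetic, not the actual vLLM kernel code. The function names and the stride of 2^20 elements per slot are made up for illustration; only the product of cache_index and the batch stride comes from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Offset computed entirely in 32 bits: the product wraps modulo 2^32
// (assumes the usual platforms where uint32_t is unsigned int).
std::uint32_t offset_32bit(std::uint32_t cache_index, std::uint32_t batch_stride) {
    return cache_index * batch_stride;
}

// True when the 32-bit result no longer equals the exact 64-bit product.
bool offset_wrapped(std::uint32_t cache_index, std::uint32_t batch_stride) {
    const std::uint64_t exact = static_cast<std::uint64_t>(cache_index) * batch_stride;
    return exact != offset_32bit(cache_index, batch_stride);
}

// Usage sketch with a hypothetical stride of 2^20 elements per cache slot.
void demo_wraparound() {
    const std::uint32_t stride = 1u << 20;
    assert(offset_32bit(4095, stride) == 4293918720u);  // still exact
    assert(!offset_wrapped(4095, stride));
    assert(offset_32bit(4096, stride) == 0u);           // 2^32 wraps to 0
    assert(offset_wrapped(4096, stride));
}
```

With these numbers, slot 4096 lands on the same offset as slot 0, so a write meant for one slot overwrites another without triggering any error.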
-
Integer Overflow Bug Fix in vLLM: uint32_t to size_t
4/4 The fix upstream? Two characters, basically: uint32_t → size_t. Weeks of debugging for one tiny type-width bug. Great reminder that in RL infra, the hardest part is often finding the right boundary around the failure. 👉 Blog with the full investigation + the upstream vLLM PR: ai21.com/blog/vllm-cuda-inte…
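A rough C++ sketch of why widening the type is enough, under stated assumptions: ParamsSketch and state_offset are hypothetical stand-ins rather than the upstream vLLM definitions, and the sketch assumes a 64-bit size_t. Once the stride is a size_t, the 32-bit index is converted to size_t before the multiply, so the product is computed in 64 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the kernel's parameter struct; only the field
// name ssm_states_batch_stride comes from the thread.
struct ParamsSketch {
    std::size_t ssm_states_batch_stride;  // assumed to be uint32_t before the fix
};

// The uint32_t index is converted to size_t by the usual arithmetic
// conversions, so the multiplication happens at size_t width.
std::size_t state_offset(const ParamsSketch& params, std::uint32_t cache_index) {
    return cache_index * params.ssm_states_batch_stride;
}

// Usage sketch with a hypothetical stride of 2^20 elements per cache slot.
void demo_widened_offset() {
    static_assert(sizeof(std::size_t) >= 8, "sketch assumes a 64-bit size_t");
    const ParamsSketch params{std::size_t{1} << 20};
    assert(state_offset(params, 4096) == (std::size_t{1} << 32));    // no wrap
    assert(state_offset(params, 4097) != state_offset(params, 1));   // no aliasing
}
```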
-

Discovery of a Structured Bug in Periodic Rollouts
2/4 The weird part: the spikes were periodic. When we increased rollouts per prompt, the spike pattern moved with them. That was the clue that this wasn't "training instability"; it was a structured rollout-path bug.
-

Discovery of a logprob anomaly during Jamba training
1/4 We hit a strange logprob mismatch while training Jamba 3B with GRPO. Rollout logprobs and training-side recompute should match before any weight update. Ours didn't. That was the canary. 🧵
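The consistency check described here can be sketched as a simple comparison of per-token logprobs. This is an illustrative C++ sketch only: the function names and the tolerance are assumptions, and the thread does not say how AI21 actually measured the mismatch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest absolute per-token gap between rollout logprobs and the
// training-side recompute over the positions both vectors share.
double max_abs_logprob_gap(const std::vector<double>& rollout,
                           const std::vector<double>& recomputed) {
    const std::size_t n = std::min(rollout.size(), recomputed.size());
    double gap = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        gap = std::max(gap, std::fabs(rollout[i] - recomputed[i]));
    }
    return gap;
}

// Before any weight update the two sides should agree up to numerical
// noise; the tolerance is a hypothetical knob, not a value from the thread.
bool logprobs_consistent(const std::vector<double>& rollout,
                         const std::vector<double>& recomputed,
                         double tolerance) {
    return rollout.size() == recomputed.size() &&
           max_abs_logprob_gap(rollout, recomputed) <= tolerance;
}

// Usage sketch with made-up logprob values.
void demo_logprob_check() {
    assert(logprobs_consistent({-0.5, -1.25}, {-0.5, -1.25}, 1e-6));
    assert(!logprobs_consistent({-0.5, -1.25}, {-0.5, -3.0}, 1e-3));
}
```

A check like this fails loudly as soon as the rollout path and the training path disagree, which is the kind of early signal the thread calls the canary.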
-
AI21Labs Discovers uint32 Overflow in vLLM Mamba-1 CUDA Kernel
Thanks to @AI21Labs for tracking down a silent uint32 overflow in vLLM's Mamba-1 CUDA kernel and contributing the fix. Root cause: `uint32_t` stride × cache_index overflows silently at scale. Fix merged in #35275. The debugging story is worth a read. 🔗 ai21.com/blog/vllm-cuda-inte…
-

SEAL: From Demo to Production, Concrete AI Engineering
Most AI demos work. Most AI systems don't. SEAL (@AI21Labs' Solutions Engineering & Architecture Lab) exists to close that gap, working with your engineers, not just handing over docs. Sound familiar? Let's talk.
-
Agent Word Losing Meaning in AI Industry
1/ "Agent" is becoming a meaningless word 🤨 Every LLM pipeline is suddenly "agentic." Every wrapper is an "agent framework." The word has been stretched until it broke. pic.twitter.com/jU6xjCx23I
🥑 Yet Another (AI) Yuval (@YuvalinTheDeep), March 19, 2026
-

AI21 Labs Building AI Operating System Not Just Agents
2/ I spent an hour with @BarakLenz, CTO at @AI21Labs, and one thesis kept coming back: we're not building agents. We're building an AI Operating System. An OS manages resources, tracks what's running, and decides when to spawn or kill work. That's the bar.
-

AI21 Labs at NVIDIA GTC 26 Booth 3103 Enterprise AI
Day 2 at @nvidia GTC 26. Stop by booth 3103 to meet the team and learn more about what we’re building for enterprise AI. @NVIDIAAI @NVIDIAAIDev #NVIDIAGTC