@jiqizhixin - AI Dynamics

HarnessAudit Framework Audits LLM Agent Safety Beyond Correct Answers

By

@jiqizhixin

–

17 June 2026 9h31

Are your LLM agents really safe even when they return the correct answer? Researchers from UCSB, UC Berkeley, Stanford, UW-Madison, and Microsoft Research introduce HarnessAudit—a framework that audits full execution trajectories for boundary compliance, resource access, and

17 June 2026

MMDesign platform generates nanobody candidates from target protein and binding site

By

@jiqizhixin

–

17 June 2026 6h27

What if you could design a new nanobody from scratch with only tens of lab tests instead of millions? MoleculeMind (led by Jinbo Xu) presents MMDesign – a platform that generates and filters nanobody candidates from just a target protein and desired binding site. The system

17 June 2026

Agent-Native Research Artifacts turn papers into executable packages

By

@jiqizhixin

–

17 June 2026 3h25

What if scientific papers were written for AI agents, not humans? Researchers from 25+ top labs (Stanford, MIT, Harvard, Meta, NVIDIA, etc.) introduce Agent-Native Research Artifacts (ARA) – a protocol that replaces narrative papers with executable research packages. ARAs

17 June 2026

AI agents learn to predict remaining budget intervals to avoid waste

By

@jiqizhixin

–

16 June 2026 19h20

Can AI agents learn to stop wasting resources on doomed tasks? Researchers from Northwestern, Michigan, Cornell, Stanford & others introduce BAGEN. It trains agents to predict remaining budget intervals and alert users early, instead of blindly overspending. Key results:

16 June 2026

W-Flow: Single-step image generation via Wasserstein gradient flow

By

@jiqizhixin

–

15 June 2026 16h41

What if you could generate high-quality images in one step instead of hundreds? Stanford and ByteDance introduce W-Flow: a single-step generator that turns random noise into target data by following a Wasserstein gradient flow. It compresses a full diffusion-like evolution into

15 June 2026

Why is Mistral mocked?

By

@jiqizhixin

–

15 June 2026 9h48

Why does everyone mock Mistral?

15 June 2026

HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry

By

@jiqizhixin

–

15 June 2026 8h40

HarnessX: A Composable, Adaptive, and Evolvable Agent Harness Foundry. Paper: https://arxiv.org/abs/2606.14249

15 June 2026

Xiaomi’s HarnessX lets AI agents redesign their own runtime

By

@jiqizhixin

–

15 June 2026 8h40

What if AI agents could redesign their own runtime on the fly? The Darwin Agent Team from Xiaomi introduces HarnessX, a foundry that lets agent harnesses (prompts, tools, memory, and control flow) compose, adapt, and evolve automatically. Instead of hand-crafting scaffolding for

15 June 2026

LLaVA-OneVision-2 tokenizes video like a codec, focusing on key moments

By

@jiqizhixin

–

15 June 2026 2h39

What if your AI could “see” video like a streaming codec—spending tokens only on the most important moments? Introducing LLaVA-OneVision-2 from Glint Lab, AIM for Health Lab, and MVP Lab. Their secret? Codec-stream tokenization: video is treated as a continuous bit-cost

15 June 2026

WorldCache framework accelerates diffusion world models by 3.7x

By

@jiqizhixin

–

14 June 2026 19h37

What if your AI world model could run 3.7x faster without sacrificing quality? Researchers from the Chinese Academy of Sciences and ETH Zurich introduce WorldCache—a caching framework for diffusion world models. Instead of naively skipping steps, it uses a curvature-guided score

14 June 2026