LLMS - AI Dynamics

108 real-world computer-use workflows: 20.6% agent completion

By

@snorkelai

–

26 June 2026 17h15

108 real-world, long-horizon computer-use workflows. Average rollout: 318 tool calls.

Top frontier agent (Claude Opus 4.8 with max thinking + batched tool calls): 20.6% end-to-end completion (54.8% partial progress). Partial progress is real. Reliable end-to-end computer use is… https://t.co/aK42mNsHdx
— Snorkel AI (@SnorkelAI) 26 June 2026

→ View original post on X — @snorkelai

26 June 2026

GLM 5.0 report and IndexShare: only notable aspects

By

@rasbt

–

26 June 2026 16h50

I was thinking about it, but beyond the original GLM 5.0 technical report, I don't think there's anything interesting to write about (except that it's better, and IndexShare, which is a relatively simple concept).

→ View original post on X — @rasbt

26 June 2026

Not yet tried Cline & Pi, comfortable with Codex due to muscle memory

By

@rasbt

–

26 June 2026 16h45

No, not yet. There's also Cline & Pi I still have to try. (I am kind of comfortable with Codex because of muscle memory.)

→ View original post on X — @rasbt

26 June 2026

Local open-weight LLMs: 30B MoE sweet spot at 40 tok/sec

By

@rasbt

–

26 June 2026 16h42

Have been taking different local open-weight LLMs for a test drive in different harnesses (Qwen-Code, Codex, Claude Code). 30B Mixture-of-Experts models are kind of a nice sweet spot and can solve challenging problems. And they get roughly 40 tok/sec on a Mac or DGX Spark, which

→ View original post on X — @rasbt

26 June 2026

One memory for all 10,000+ notes in structured form for AI agents

By

@whats_ai

–

26 June 2026 16h04

Our talk just got selected as a keynote for the AI Engineer World's Fair 2026. So I want to share what @pauliusztin_ and I built: one memory for all 10,000+ of our notes. Everything I've learned and saved now lives in one structured memory my agents can actually use. It takes

→ View original post on X — @whats_ai

26 June 2026

Chinese sellers offer trillions of Claude tokens at a 90% discount

By

@datachaz

–

26 June 2026 15h53

TIL Chinese bros are selling trillions of Claude tokens at a 90% discount

→ View original post on X — @datachaz

26 June 2026

Anthropic’s Mythos reportedly pushes DeepSeek into $7.4B fundraising

By

@kimmonismus

–

26 June 2026 15h12

Anthropic’s Mythos preview reportedly pushed DeepSeek into a $7.4B fundraising – because they could not compete with Mythos. Until now, the three-year-old Chinese AI lab had relied on CEO Liang Wenfeng’s personal wealth instead of outside capital. The Information reports the

→ View original post on X — @kimmonismus

26 June 2026

LangSmith LLM Gateway enables flexible budget controls for teams

By

@langchain

–

26 June 2026 15h04

Before we shipped LangSmith LLM Gateway, we rolled it out internally. We don’t have to wait until the end of the month to understand spend. We have been able to set budgets by org, workspace, user, or API key. Our teams can flexibly use coding agents without creating

→ View original post on X — @langchain

26 June 2026

HyperExtract: LLM framework converting unstructured text to knowledge

By

@alphasignalai

–

26 June 2026 14h00

HyperExtract turns messy documents into actual knowledge systems. It is an LLM-powered framework for converting unstructured text into strongly typed Knowledge Abstracts. It can extract simple lists, Pydantic models, knowledge graphs, hypergraphs, and spatio-temporal graphs.

→ View original post on X — @alphasignalai

26 June 2026

MinerU parses ugly documents into clean Markdown and JSON for LLM workflows

By

@alphasignalai

–

26 June 2026 14h00

MinerU parses ugly documents into clean Markdown and JSON for LLM workflows. It supports PDFs, DOCX, PPTX, XLSX, images, and web pages through a VLM + OCR dual engine. It can handle scanned docs, handwriting, formulas to LaTeX, tables to HTML, multi-column layouts, and

→ View original post on X — @alphasignalai

26 June 2026