@emollick - AI Dynamics - Page 7 of 168

Labs may keep AGI internal to capture value

By @emollick – 16 June 2026 2h26

If AGI is achievable & labs can be banned from using a model internally ONLY if they release the model publicly, the Big Three labs may decide it is better to capture all the value from AGI themselves by expansion & acquisition. Sharing AI access with other firms triggers risk.

→ View original post on X — @emollick

Meme of Mistral’s giant cat model with infinite benchmarks spreads

By @emollick – 15 June 2026 19h48

The le chaton fat meme is leaking to the outside world and I expect to be asked about Mistral's new ginormous cat model with infinite benchmark scores at my next meeting with corporate leaders. I guess it is better than being asked about the "MIT pilot AI study."

Models weak on vision cause error accumulation in visual steps

By @emollick – 15 June 2026 18h49

Very clever. And matches what I would expect: models are weak on vision relative to everything else, so visual steps are where errors accumulate most in workflows.

AI may never be jailbreak-proof nor hallucination-free

By @emollick – 15 June 2026 18h27

And AI systems may never be jailbreak-proof or hallucination-free. And individual queries may matter less than a bad actor breaking a problem down into pieces and feeding it through multiple projects and prompts. And AIs themselves may change behavior unpredictably with context.

Complexity of AI Regulation: Models Are Just One Piece

By @emollick – 15 June 2026 18h24

Bright regulatory lines for AI are inherently complicated because models are just a piece of the puzzle: harnesses can make models more capable, a less capable open system may be more or less risky than a more capable closed one, skills/connected systems change risk levels, etc.

AI solves 7/10 hard math problems but still criticized

By @emollick – 15 June 2026 17h28

Weird headline – I am not sure solving 7 out of 10 novel very hard problems meant AI "did not live up to the task," when 15 months ago LLMs couldn't do math. But the actual study is interesting and illuminates flaws & successes of AIs in math. https://1stproof.org/assets/docs/report.pdf …

Open models enable AI public good contributions from non-frontier nations.

By @emollick – 15 June 2026 17h00

Current open models are now good enough to pull off some of these projects if scaffolded properly while others (co-scientist) benefit from the AI frontier. Because of that, this is an area where nations without frontier labs could contribute to the impact of AI for public good.

AI moonshots for social good: universal tutors, co-scientists, remote medical help

By @emollick – 15 June 2026 16h53

It is a good time for moonshots. AI has reached a level where there are transformative projects that could result in huge social good, but require public R&D, consensus & transparency to pull off. Examples: universal tutors, co-scientist/replication systems, remote medical help.

Deleted tweet: API users misunderstand frontier model power in native harnesses

By @emollick – 15 June 2026 16h45

Deleted a tweet on the fact that API users don't understand how much more powerful the frontier models are in native harnesses, since I didn't differentiate in the post (limited characters!) between folks carefully evaluating other harnesses for tasks & those just using the naked API.

GitHub upload by Claude 4.8 Opus adds text size slider

By @emollick – 15 June 2026 6h29

GitHub (uploaded by Claude 4.8 Opus, which also added a text size slider; I didn't let Opus touch the somewhat odd prose that was typical of Fable 5):
