Top Models Per Use Case
Coding – Codex 5.3
SeeDance 2.0 – now generally available!
Voice – Gemini Flash Live
Image – Nano Banana Pro
Everyday use – GPT 5.4
Claw – m2.7
Fast – Grok 4.2
Agentic – Opus 4.6
Cheap agentic – GLM 5.1
OCR – Gemini Flash

Codex is actually very good… realizing this a bit too late
PROMPT ENGINEERING
-
PLINY.TV: AI-Curated YouTube Discovery from Model Memory
INTRODUCING: PLINY.TV! 🐉📺
— Pliny the Liberator 🐉 (@elder_plinius) April 9, 2026

this one's a little different than my usual project, but i hope you enjoy this foray! 🫶

the idea? AI-curated content… straight from the latent space!

as it turns out, LLMs have been training on lots and LOTS of embedding links 🧐 so in the Pliny TV backend, there is no search API. no YouTube Data API key. no vector DB. no scraper. discovery happens entirely inside a language model's memory of the internet.

you describe a channel in plain English — "music videos from 200", "cozy 3am lofi rain", "cursed liminal space docs" — and an AI curator recalls real YouTube video IDs directly from its weights! the system prompt literally says: "you have been trained on millions of YouTube videos — USE THAT KNOWLEDGE."

we then hit YouTube's oEmbed as a reality check. survivors make it to the channel queue, hallucinations get filtered out. the LLM is the search engine. the index is the model.

because there's no keyword index, no recency bias, no algorithm optimizing for watch time, the curator doesn't search. you give it a vibe and it navigates the probability space of that concept, pulling back whatever is most salient. which turns out to be, almost without fail, the most nostalgic, cultishly-loved, collectively-imprinted videos in that region of latent space. the memes you forgot you'd forgotten!

you ask for a vibe, you get a tour of the collective unconscious, sorted by resonance. less search engine, more akashic record.

and one of the most fascinating elements is seeing how this effect differs between models! different curators produce vastly different generated playlists, even with the same prompt.

what's in the box?
📡 create channels & publish them publicly so anyone can tune in, plus an option to have AI auto-generate an idea
🤖 AI curator auto-queues 15–20 videos per run, auto-shutoff at 20min to save API cost
👥 AI personas watch with you and chat live — custom personas generated from your channel's vibe
🎨 110+ preset themes — each one a full visual universe, chosen by the model based on your prompt
🎬 auto-skip on dead videos, shuffle, request queue, fallback tiers when the AI whiffs
📱 retro flip phone UI — dial channels by number, unlock secret speakeasy rooms with hidden codes (420, 1337, etc.), DTMF tones and all

you're not searching YouTube. you're asking an AI what it remembers. and what it remembers is what the internet collectively loved the most. video streaming as dream logic. pliny.tv 🐉📺
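The oEmbed "reality check" described above can be sketched in a few lines. YouTube's public oEmbed endpoint returns metadata for real video IDs and an error for nonexistent ones; the function names and filtering loop here are illustrative, not the actual pliny.tv backend:

```python
import urllib.error
import urllib.parse
import urllib.request

def oembed_url(video_id: str) -> str:
    """Build the YouTube oEmbed query URL for a candidate video ID."""
    watch = f"https://www.youtube.com/watch?v={video_id}"
    return ("https://www.youtube.com/oembed?"
            + urllib.parse.urlencode({"url": watch, "format": "json"}))

def is_real_video(video_id: str) -> bool:
    """True if YouTube's oEmbed endpoint recognizes the ID."""
    try:
        with urllib.request.urlopen(oembed_url(video_id), timeout=5) as resp:
            return resp.status == 200
    except urllib.error.URLError:  # 400/404 responses for hallucinated IDs
        return False

def filter_hallucinations(candidate_ids: list[str]) -> list[str]:
    """Keep only the IDs that survive the oEmbed reality check."""
    return [vid for vid in candidate_ids if is_real_video(vid)]
```

A curator run would pass the model's recalled IDs through `filter_hallucinations` before queueing them; survivors go to the channel, the rest are dropped.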
→ View original post on X — @scobleizer, 2026-04-09 20:33 UTC
-
AI Capability Gap: Free Models vs Frontier Agentic Systems
Judging by my tl there is a growing gap in understanding of AI capability.

The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This group reacts by laughing at various quirks of the models, hallucinations, etc. Yes, I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability of the latest round of state-of-the-art agentic models this year, especially OpenAI Codex and Claude Code.

But that brings me to the second issue. Even if people paid $200/month to use the state-of-the-art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing, because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus follows.

So that brings me to the second group of people, who *both* 1) pay for and use the state-of-the-art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math, and research. This group is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch it melt programming problems that you'd normally expect to take days/weeks of work.
It's this second group of people that assigns a much greater gravity to the capabilities, their slope, and the various cyber-related repercussions.

TLDR: the people in these two groups are speaking past each other. It really is simultaneously the case that OpenAI's free and, I think, slightly orphaned (?) "Advanced Voice Mode" will fumble the dumbest questions in your Instagram reels and, *at the same time*, OpenAI's highest-tier paid Codex model will go off for an hour to coherently restructure an entire code base, or find and exploit vulnerabilities in computer systems.

This part really works and has made dramatic strides because of two properties: 1) these domains offer explicit reward functions that are verifiable, meaning they are easily amenable to reinforcement learning training (e.g. unit tests passed, yes or no, in contrast to writing, which is much harder to explicitly judge), but also 2) they are a lot more valuable in B2B settings, meaning that the biggest fraction of the team is focused on improving them. So here we are.

staysaasy (@staysaasy): "The degree to which you are awed by AI is perfectly correlated with how much you use AI to code." — https://nitter.net/staysaasy/status/2042063369432183238#m
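The "verifiable reward" point above is concrete enough to sketch: the reward signal for an RL training step can literally be "did the test suite exit cleanly". This is a generic illustration of that idea, not any lab's actual pipeline (real setups sandbox the run and shape the reward more carefully):

```python
import subprocess

def verifiable_reward(cmd: list[str], cwd: str = ".") -> float:
    """Binary reward from a verifiable check: 1.0 if the command
    (e.g. a test suite) exits with status 0, else 0.0.

    Contrast with writing quality, where no such exit code exists.
    """
    result = subprocess.run(cmd, cwd=cwd, capture_output=True)
    return 1.0 if result.returncode == 0 else 0.0
```

An RL loop would run the model's patched repo through something like `verifiable_reward(["pytest", "-q"], cwd=repo_dir)` and reinforce trajectories that score 1.0.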
-
Sonnet Calling Opus Improves Performance and Reduces Costs
Allowing Sonnet to "phone a friend" (i.e. call Opus) increases performance while also reducing total cost, since it reduces tokens spent trying to solve more complex tasks.
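A back-of-envelope illustration of the cost claim. All per-million-token prices and token counts below are hypothetical, chosen only to show the shape of the arithmetic, not Anthropic's actual rates:

```python
# Hypothetical $/M-token prices (NOT real pricing).
OPUS_IN, OPUS_OUT = 15.00, 75.00      # expensive "friend"
SONNET_IN, SONNET_OUT = 3.00, 15.00   # cheap workhorse

def cost(tokens_in: int, tokens_out: int, p_in: float, p_out: float) -> float:
    """Dollar cost of one call at the given per-million-token prices."""
    return (tokens_in * p_in + tokens_out * p_out) / 1_000_000

# Opus alone burning the full task's tokens:
opus_alone = cost(20_000, 5_000, OPUS_IN, OPUS_OUT)

# Sonnet does the work, with one short "phone a friend" call to Opus:
hybrid = cost(20_000, 5_000, SONNET_IN, SONNET_OUT) \
       + cost(2_000, 500, OPUS_IN, OPUS_OUT)

print(f"Opus alone: ${opus_alone:.4f}, hybrid: ${hybrid:.4f}")
```

Under these assumed numbers the hybrid run comes out well under a third of the Opus-only cost, and that is before counting the tokens Sonnet no longer wastes flailing on tasks above its pay grade.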
-
Share MCP Tool Config in Cursor and Implementation
By
–
Can you please share the MCP tool config in Cursor + the underlying MCP implementation?
-

OpenClaw: Anthropic’s New Advisor-Executor Strategy for Claude
They’ve built OpenClaw.
— Vadim (@VadimStrizheus) April 9, 2026

Claude (@claudeai): "We're bringing the advisor strategy to the Claude Platform. Pair Opus as an advisor with Sonnet or Haiku as an executor, and get near Opus-level intelligence in your agents at a fraction of the cost." — https://nitter.net/claudeai/status/2042308622181339453#m
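The advisor-executor pattern quoted above can be sketched as a simple escalation loop. Anthropic hasn't published the exact API shape here, so the model calls are stubbed; only the control flow (cheap executor, one expensive advisor call when confidence is low) is the point:

```python
def executor_model(task: str, hint: str = "") -> tuple[str, float]:
    """Cheap executor (think Sonnet/Haiku). Returns (answer, confidence).
    Stubbed: a real version would call the Claude API."""
    if hint:
        return f"answer to {task!r} using hint {hint!r}", 0.95
    return f"draft answer to {task!r}", 0.4 if "hard" in task else 0.9

def advisor_model(task: str, draft: str) -> str:
    """Expensive advisor (think Opus) reviews the draft and returns
    guidance. Stubbed for illustration."""
    return f"plan for {task!r}"

def solve(task: str, escalate_below: float = 0.7) -> str:
    """Executor answers; below the confidence threshold it phones the
    advisor once, then retries cheaply with the advisor's guidance."""
    answer, confidence = executor_model(task)
    if confidence < escalate_below:
        hint = advisor_model(task, answer)           # one short, costly call
        answer, _ = executor_model(task, hint=hint)  # cheap retry
    return answer
```

Easy tasks never touch the advisor, which is where the "near Opus-level intelligence at a fraction of the cost" framing comes from: Opus tokens are spent only on the small slice of work the executor can't handle.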
→ View original post on X — @ceobillionaire, 2026-04-09 18:29 UTC
-
Hermes Agent Recommended for Beginners Over Openclaw
Honestly, I'm using both, but if you're just getting started and haven't used Openclaw before, Hermes agent is a great place to start
-
AI Workflow for Building VR on the Web Without Code
We shipped a fully integrated AI workflow for building VR on the web.
— Meta Horizon Developers (@MetaHorizonDevs) April 9, 2026

Just describe what you want. AI builds it, tests it, and fixes bugs without you touching the code.

Try it yourself here 👉 bit.ly/4czvxUT

Discover how it works 🧵👇
→ View original post on X — @scobleizer, 2026-04-09 17:51 UTC
-

Give Claude Eyes: Screenshot Skill for Claude Code
Give me one minute, and I’ll improve your Claude Code experience immediately.

This is the first skill I built. And it’s the skill I use most often. *drumroll* It’s a SCREENSHOT skill. And honestly, I’m shocked Anthropic hasn’t built this functionality into Claude Code itself.

Claude has access 🔑 But Claude needs EYES 👁️

Here’s what you’re going to do:

1) locate what folder all your screenshots go to (and if it’s your desktop, you’re a maniac, change it). Mine goes to a folder on my desktop called “organized screenshots”

2) prompt Claude Code with the following:

“Build me a skill called ‘/ss’ that lists out the files in <screenshots folder path> from newest to oldest, and grabs the newest. This is how I will speak to you visually. I also want an argument for the screenshot count – if I type ‘/ss 4’, you should grab the four most recent screenshots in that folder. If I type no number after ‘ss’ then only grab the most recent screenshot. Then, whatever follows after that argument is the action I want you to take. ‘/ss huh’ means I need you to explain the screenshots’ content to me. ‘/ss 3 make infographic plz’ means I need you to grab the last 3 screenshots and use their content to make me a unified infographic. ‘/ss fix’ likely means that I’m screenshotting an error message in code we’re building out and I need you to understand the error message, figure out the bug, and edit the code to fix it. Or, if we’re in the middle of a front end design project, it might mean the design has an error (like overlapping text) to fix. ‘/ss do this’ likely means that I screenshotted a smart thing someone did online and I want us to learn from it and do the same and remix it so it’s the most goal-oriented outcome for me based on what you know about me”

3) let it build you the skill

4) go on X

5) scroll through your feed and screenshot one thing you find valuable

6) open a new terminal and prompt Claude with “/ss” + “do this” or “explain” or “turn this into an infographic”

7) enjoy – you just gave Claude eyes 🎉

Let me know how it goes. Again, this is my most used Claude Code skill by a landslide and easily saves me an hour a week. Cc @bcherny @trq212
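The core of the skill Claude ends up building is just "sort a folder by modification time, take the newest N". A minimal sketch of that piece (function name and layout are illustrative, not what Claude will literally generate):

```python
from pathlib import Path

def newest_screenshots(folder: str, count: int = 1) -> list[Path]:
    """Return the `count` most recently modified files in `folder`,
    newest first — the '/ss' and '/ss 4' behaviour described above."""
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:count]
```

With this in hand, `/ss` maps to `newest_screenshots(folder)` and `/ss 4` to `newest_screenshots(folder, 4)`; everything after the count is the instruction Claude applies to those images.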
→ View original post on X — @alliekmiller, 2026-04-09 16:33 UTC