Thanks! My agent reads 50,000 of the smartest people and companies alive
AGI
-
Why We Don’t Have AGI Yet: Essential Technologies Missing
Why we don't have AGI yet. Interesting essay about what still needs to be built to get us to AGI. Jyothi Venkat (@jyothiwrites) x.com/i/article/204201579076… — https://nitter.net/jyothiwrites/status/2042023088167252162#m
-
Claude Model Evaluation Bias in Character Assessment
I'm working on character evals and noticed that Claude would constantly pick itself as #1, so I removed the model names from the judge and changed things.
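A minimal sketch of what removing the model names from the judge could look like, assuming each model's answer is stored under its name. The helper names (`blind_responses`, `unblind`) and the "Response A/B/C" labels are hypothetical and not taken from the original post; this is one possible way to blind a judge, not the author's actual setup.

```python
import random


def blind_responses(responses, seed=0):
    """Replace model names with neutral labels and shuffle the order.

    `responses` maps a model name to its answer text. Returns the
    anonymized (label, text) pairs to show the judge, plus a key for
    mapping labels back to model names afterwards.
    """
    rng = random.Random(seed)
    names = list(responses)
    rng.shuffle(names)  # so list position does not hint at the model
    blinded, key = [], {}
    for i, name in enumerate(names):
        label = f"Response {chr(ord('A') + i)}"  # hypothetical label scheme
        blinded.append((label, responses[name]))
        key[label] = name
    return blinded, key


def unblind(ranking, key):
    """Map the judge's ranked labels back to the original model names."""
    return [key[label] for label in ranking]
```

The judge only ever sees the labels and the answer text; the key stays outside the judge prompt until the ranking comes back. Answers that mention their own model name in the text would still leak identity, so they may need scrubbing as a separate step.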
-
Building Benchmark Factory to Combat Model Overfitting
As models overfit to benchmarks, @alexgshaw of @LaudeInstitute is thinking about the problem this way: “how can we build the benchmark factory – the machine that other people can use to make their benchmarks – as opposed to just creating our own benchmarks one-by-one?” Enter… pic.twitter.com/3lre0GRO3G
— Snorkel AI (@SnorkelAI) April 8, 2026
-
AI Performance Drift and Human Oversight in Future Systems
Yeah, it's interesting how it drifts and gets lazy over time. Even with an amazing memory and rule set. I guess there still is a role for humans in the future. "Keep your lazy AIs working hard." 🙂
-
Teaching AI by assembling world’s smartest experts
I taught it by giving it a football stadium of the world's smartest people to study. 🙂
-
AI hasn’t produced a single paperclip yet in 2026
it's 2026 and ai has not even made a single paperclip
-
Muse Spark enables predictable scaling toward personal superintelligence
With Muse Spark, we are on a predictable and efficient scaling trajectory. We look forward to sharing increasingly capable models on the path to personal superintelligence soon.
-
Scaling Properties of Muse Spark: Pretraining, RL, and Reasoning
To build personal superintelligence, our model’s capabilities should scale predictably and efficiently. Below, we share how we study and track Muse Spark’s scaling properties along three axes: pretraining, reinforcement learning, and test-time reasoning. Let’s start with