Worse, they imply other scaling laws should rather be fit to sigmoids. The only reason we're seeing "exponential" looking growth is because of human ingenuity stacking the sigmoid curves! (Great time to be in fundamental research rather than blindly following corporate trends.)
@alexjc
-

RL on LLMs Hits Scaling Ceiling at 61% Performance
It's hard to overstate how devastating this paper is, not only for reinforcement learning. They spent $4m of compute to find out that RL on LLMs basically taps out at 61% "asymptotic pass rate" (exact rate depends on context), but they built a *ceiling* into the scaling law!
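A scaling law with a built-in ceiling can be sketched as below. This is only an illustration, not the paper's fitted law: the functional form (a sigmoid in log-compute) and the names `ceiling`, `start`, `midpoint`, and `slope` are assumptions, and every parameter value except the 0.61 asymptote quoted above is made up.

```python
def sigmoid_pass_rate(compute, ceiling=0.61, start=0.20, midpoint=1e3, slope=1.0):
    """Pass rate that saturates at `ceiling` as compute grows (illustrative form)."""
    # Logistic curve in log(compute): halfway between `start` and `ceiling`
    # at compute == midpoint, approaching `ceiling` as compute -> infinity.
    # All values except the 0.61 ceiling are hypothetical.
    return start + (ceiling - start) / (1.0 + (midpoint / compute) ** slope)


if __name__ == "__main__":
    # Each 10x increase in compute buys less and less once past the midpoint.
    for c in (1e1, 1e2, 1e3, 1e4, 1e5, 1e6):
        print(f"compute={c:>9.0f}  pass_rate={sigmoid_pass_rate(c):.3f}")
```

Unlike a power law, no amount of extra compute pushes this curve past `ceiling`; the only way up is a different curve with a higher asymptote.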
-
Claude vs GPT-5: Performance Degradation in Extended Conversations
Claude struggled a bit with syntax errors after a long chat, which is uncharacteristic, but complexity did build up. GPT-5, by contrast, just seems to stop working after a while: it thinks and then says what it'd do without doing it.
-
Cursor AI Token Limits: Smart Summarization Over Extended Context
The recent @cursor_ai updates to summarize your chat when you hit token limits prove you don't really need a long context, just solid problem-solving. The chat about my project is a long list of related things to solve and doesn't need stop…
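The idea of summarizing once a token limit is hit can be sketched roughly as follows. This is not how Cursor implements it: `count_tokens`, `summarize`, and `compact_history` are hypothetical helpers, the word-count tokenizer is a crude stand-in, and a real system would call a model to write the summary.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def summarize(messages):
    # Placeholder summarizer (hypothetical): keeps the text before the first
    # period of each message. A real system would ask a model instead.
    return "Summary: " + " | ".join(m.split(".")[0] for m in messages)


def compact_history(messages, budget, keep_recent=2):
    """Fold older messages into one summary once the history exceeds `budget` tokens."""
    total = sum(count_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return list(messages)  # still fits, or nothing old enough to fold
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [summarize(older)] + recent


if __name__ == "__main__":
    chat = [
        "Fix the login redirect. It loops on expired sessions.",
        "Add a retry to the upload job. Three attempts is enough.",
        "Rename the config loader. The old name is misleading.",
        "Now write tests for the upload retry.",
    ]
    for line in compact_history(chat, budget=20):
        print(line)
```

The most recent messages stay verbatim because they carry the task in progress; only the older, already-solved items are compressed.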
-
Vibe Coding vs Sloptimizing: Programming Quality Debate
Who called it vibe coding and not sloptimizing?
-
Should AI-Generated Content Require a .slop File Extension?
Do we need a .slop file extension as a suffix to inform (and warn) users that the content has not been reviewed or approved by a human?
.jpg.slop
.docx.slop
.py.slop
You'd have to manually remove the .slop as an explicit acknowledgement of the risk!
-

Functional Stack Languages Joy and Factor Explored in Stanford Slides
Fascinating set of slides by Jon Purdy about functional stack languages such as Joy and Factor. https://web.stanford.edu/class/ee380/Abstracts/171115-slides.pdf
-
Claude Update Impact on Developer Understanding and Code Workflow
Felt it too since the last Claude update. My daily work (via Cursor) significantly changed… Hardest part these days is making the executive decision: do I even need to understand how it works? For throw-away scripts or tools with verifiable outputs it's increasingly tough!
-
Simplifying RL Training: Beyond Enterprise Dependencies
Luckily, codebases like NanoGPT or the speedrun completely reset people's expectations. But I couldn't find the same for RL; just training Qwen3-4b locally requires hundreds of enterprise-y packages, many of which glitch out when you get slightly off the beaten path…
-
Unsloth Monkey-Patching Causes Distributed Training Deadlock Issues
Eventually tracked down the problem to the Unsloth monkey-patching. Distributed runs fail even with one process, but if you use the regular models from `transformers` and `peft`, trained with `trl`, then it doesn't deadlock… Hmm.