Note: Qwen3-Max is not open weights and is a quite large (and seemingly expensive) model.
@emollick
-
AI Adoption Barriers: Organizational Change Over Technical Limits
I suspect current models are “smart” enough to transform a lot of work, despite their jaggedness. The barriers are less AI ability now than the fact that there is a learning curve of how to use them in organizations, along with the need for organizational change & new processes.
-
Qwen3-Max Impresses as Non-Reasoning Model
So far, Qwen3-Max seems impressive for a non-reasoning model, doing a good job at a lot of my weird tests that even some reasoners struggle with.
-
AI Economic Impact: Plow Analogy and Technological Disruption
That's very funny. Two things to push back on: 1) Plow development absolutely reshaped the human economy and displaced work, but it happened over a long time (moldboard plows came late, etc.) because tech development was slow; 2) Whether AI is a normal technology is open to debate.
-
AGI Economic Implications: Labor Displacement and Human Work Value
Some new theoretical economics papers looking at the implications of AGI. These two papers argue that a true AGI-level AI (equivalent to a human genius), if achieved, would eventually displace most human labor and reduce the economic value of remaining human work to near-zero.
-
Gemini 2.5 Pro Performance Assessment Among Fast Smaller Models
It doesn't feel like it is at Gemini 2.5 Pro level, despite the benchmarks, but it stands out among the fast smaller models.
-
Grok 4 Fast Excels at Creative Coding Challenges Over Language
The new Grok 4 Fast seems to do well with creative coding challenges ("create a visually interesting shader that can run in twigl, make it like the ocean in a storm", "make a futuristic starship panel for p5js") for a small model but not quite as great in creative language tests pic.twitter.com/C1HZtSNbKP
— Ethan Mollick (@emollick) September 20, 2025
-
Generalist AI Limitations and Opportunities in Coaching and Simulation
A good rule of thumb is that generalist AI models are less likely to be useful when being “a friendly assistant” is unhelpful. It is why there is opportunity (for now, at least) for new approaches in coaching & teaching (some adversarial elements) and digital twins & simulation.
-
Economists Examine Transformative AI Impact on Firms and Society
Nice thread of talk summaries from papers presented by economists & economic sociologists at a conference taking the prospect of transformative AI seriously. Lots of smart people doing interesting work to try and understand what this new era means for firms, science & all of us.
-
LMArena Benchmarking Gamed by Sycophantic AI Responses
It is kind of amazing how many good benchmarking tools have been saturated or gamed (whether by accident or on purpose) in the past few months. LMArena really seemed like a good method, but then it turned out that you could just go full sycophantic and people loved it.