@jxmnop - AI Dynamics - Page 11 of 78

Model Hallucinates Domino Problem Repeatedly in Token Loop

By

@jxmnop

–

08 August 2025 21h21

and it truly is a tortured model. here the model hallucinates a programming problem about dominos and attempts to solve it, spending over 30,000 tokens in the process.

completely unprompted, the model generated and tried to solve this domino problem over 5,000 separate times
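The post does not say how the repeated domino generations were detected or counted. As a rough illustration only, one simple way to flag this kind of token loop is to measure how many fixed-size token windows in a generation are repeats of an earlier window. The function names `count_repeated_ngrams` and `repetition_ratio`, the window size `n`, and the whitespace tokenization below are all assumptions made for this sketch, not the author's method.

```python
from collections import Counter


def count_repeated_ngrams(tokens, n=8):
    """Return every n-token window that occurs more than once, with its count.

    `tokens` is any sequence of hashable items (hypothetical input: a
    whitespace-split generation). `n` is an assumed window size.
    """
    windows = Counter(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )
    return {window: count for window, count in windows.items() if count > 1}


def repetition_ratio(tokens, n=8):
    """Fraction of n-token windows that duplicate another window.

    0.0 means every window is unique; values near 1.0 suggest the text
    is stuck in a loop.
    """
    total = len(tokens) - n + 1
    if total <= 0:
        return 0.0  # text shorter than one window: nothing to compare
    unique = len({tuple(tokens[i:i + n]) for i in range(total)})
    return 1.0 - unique / total


if __name__ == "__main__":
    # Hypothetical looping output: the same sentence emitted 50 times.
    looping = ("place the dominoes on the board . " * 50).split()
    # Hypothetical non-looping output: 100 distinct tokens.
    varied = [str(i) for i in range(100)]

    print(repetition_ratio(looping, n=4))  # close to 1.0
    print(repetition_ratio(varied, n=4))   # 0.0
```

A ratio like this only catches near-verbatim loops. Counting thousands of paraphrased restatements of the same problem would need something fuzzier, such as clustering by embedding similarity.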

→ View original post on X — @jxmnop

8 August 2025

Embedded Generations: AI Model Capabilities in Math and Code

By

@jxmnop

–

08 August 2025 21h21

here's a map of the embedded generations.

the model loves math and code. i prompt with nothing and yet it always reasons. it just talks about math and code, and mostly in English.

math – probability, ML, PDEs, topology, diffeq
code – agentic software, competitive programming,

→ View original post on X — @jxmnop

8 August 2025

GPT-OSS Training Data Analysis: Bizarre Results Revealed

By

@jxmnop

–

08 August 2025 21h21

curious about the training data of OpenAI's new gpt-oss models? i was too. so i generated 10M examples from gpt-oss-20b, ran some analysis, and the results were… pretty bizarre.

time for a deep dive

→ View original post on X — @jxmnop

8 August 2025

GPT-5 Scaling Laws: Diminishing Returns on General Intelligence

By

@jxmnop

–

08 August 2025 2h30

shortest explanation of GPT-5: this is exactly what the scaling laws predicted! the model is better, the returns are diminishing, and sadly absolute general intelligence improvements will only get smaller.

the good news is there’s so much still to do. personality, reasoning,

→ View original post on X — @jxmnop

8 August 2025

Four Years Minimum Wage Career Path Challenges

By

@jxmnop

–

08 August 2025 0h44

no, it requires four years of minimum wage and perpetual headache

→ View original post on X — @jxmnop

8 August 2025

PhD holders unlikely to produce flawed data visualizations

By

@jxmnop

–

07 August 2025 23h07

if they have a phd then there’s no way they would’ve made this graph after two phds

→ View original post on X — @jxmnop

7 August 2025

PhD-level rigor in data visualization and research standards

By

@jxmnop

–

07 August 2025 23h00

people aren't gonna wanna hear this but i truly do not believe this mistake could’ve been made by someone with a phd. after going through brutal peer review several times you just stop doing stuff like this. whoever made this graph clearly has a bachelor's degree. maybe a master's

→ View original post on X — @jxmnop

7 August 2025

You Might Be AI: A Humorous Reflection

By

@jxmnop

–

07 August 2025 22h41

okay now i think you might be AI

→ View original post on X — @jxmnop

7 August 2025

Python and SWE-bench: Comparing Claude and ChatGPT Performance

By

@jxmnop

–

07 August 2025 22h37

python is great. also i love SWE-bench btw, it just might not be the highest-signal point of comparison between Claude and ChatGPT these days

→ View original post on X — @jxmnop

7 August 2025

SWE-bench Scores Inflated by Django Training Data Bias

By

@jxmnop

–

07 August 2025 22h23

good time to remind everyone that a high score on SWE-bench really just indicates the training data contained a sufficiently large proportion of Django

→ View original post on X — @jxmnop

7 August 2025