@jxmnop - AI Dynamics

LLAMA 5 Vision: Dense Multimodal Model Across Four Sizes

By

@jxmnop

–

30 September 2025 21h23

if it were up to me, there would be a LLAMA 5

it would be dense. it would be multimodal. it would come in four sizes: 1B, 8B, 80B, 800B. it would be iterated upon until the 8B LLAMA 5 totally outperformed the 80B LLAMA 3. this would be a huge asset to the community.

→ View original post on X — @jxmnop

30 September 2025

Learning Bits Through RL and SFT: Research Insights

By

@jxmnop

–

30 September 2025 1h22

best paper or blog i've read in a while, highly recommend! John is brilliant and his research sets an example for the rest of us. recently i too have been thinking deeply about how many bits might be learned via one step of RL or SFT.. if you're thinking about this too, lmk!

→ View original post on X — @jxmnop

30 September 2025

AI Progress, Evals, and Reinforcement Learning in Job Market

By

@jxmnop

–

26 September 2025 18h11

quite a privilege to be a guest on Odd Lots! talked w joe & tracy about pace of progress in AI, broken evals, the crazy job market, and the difference between supervised and reinforcement learning

→ View original post on X — @jxmnop

26 September 2025

Transformers Need 56.9B Parameters to Memorize Wikipedia

By

@jxmnop

–

25 September 2025 20h34

by the way. recently wrote a paper on this! for transformers, the number is about 3.6 bits-per-parameter

so you would need 25GB ÷ 3.6 bits-per-parameter ≈ 56.9B parameters to exactly memorize Wikipedia

that’s a pretty big model actually
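The arithmetic in this post can be checked with a short sketch. It is not from the original post: the function name and the two byte conventions are assumptions made for illustration, and only the 25GB and 3.6 bits-per-parameter figures come from the text. The post does not say which definition of "GB" or which exact corpus size was used, so the sketch brackets the quoted 56.9B figure instead of reproducing it exactly.

```python
# Rough check of the "parameters needed to memorize Wikipedia" estimate.
# Only 25 GB and 3.6 bits per parameter come from the post; everything
# else (names, byte conventions) is an illustrative assumption.

BITS_PER_PARAM = 3.6  # capacity figure quoted in the post


def params_to_memorize(size_bytes: float, bits_per_param: float = BITS_PER_PARAM) -> float:
    """Number of parameters needed to store size_bytes at the given capacity."""
    return size_bytes * 8 / bits_per_param


# Two common readings of "25GB"
decimal_gb = params_to_memorize(25 * 10**9)  # 1 GB = 10^9 bytes
binary_gib = params_to_memorize(25 * 2**30)  # 1 GiB = 2^30 bytes

# Corpus size that would give exactly the 56.9B quoted in the post
implied_bytes = 56.9e9 * BITS_PER_PARAM / 8

print(f"decimal GB: {decimal_gb / 1e9:.1f}B parameters")  # 55.6B
print(f"binary GiB: {binary_gib / 1e9:.1f}B parameters")  # 59.7B
print(f"implied corpus size: {implied_bytes / 1e9:.1f} GB")  # 25.6 GB
```

Under these assumptions the quoted 56.9B would correspond to a corpus of roughly 25.6 GB, which is consistent with "25GB" being a rounded figure.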

→ View original post on X — @jxmnop

25 September 2025

RoPE Encoding: Technical Discussion on Language Models

By

@jxmnop

–

24 September 2025 2h30

these models don’t use RoPE but you’re probably directionally correct

→ View original post on X — @jxmnop

24 September 2025

AI Models Limited by Missing Training Data for New Hardware

By

@jxmnop

–

23 September 2025 17h01

i could definitely be wrong, but my thinking is that models are very good at doing problems that have similar solutions in their training data

at the time of release for every new line of GPUs, there is no training data available

→ View original post on X — @jxmnop

23 September 2025

Kernel Writing Skills Command Premium Salaries in AI Job Market

By

@jxmnop

–

23 September 2025 16h41

out of all my AI PhD friends on the job market this year, the ones that did the best (by far) write kernels. companies are paying out the wazoo for this skillset

models approaching human-level performance in kernel-writing over the next year looks pretty unlikely

→ View original post on X — @jxmnop

23 September 2025

Human-Only Social Networks: The Bot Spam Solution

By

@jxmnop

–

10 September 2025 23h28

the trendlines indicate that one day this app will be overrun by hordes of bots producing low-quality, difficult-to-identify slop

and at this point someone will be forced to build the first true Humans-Only social network. No Bots Allowed. Fingerprint Required

→ View original post on X — @jxmnop

10 September 2025

Do Humans Learn Through Adversarial Methods?

By

@jxmnop

–

10 September 2025 21h36

humans learn adversarially? this is the first im hearing of this

→ View original post on X — @jxmnop

10 September 2025

Understanding AI Through the Lens of Compression

By

@jxmnop

–

09 September 2025 2h43

nearly everything in AI can be understood through the lens of compression
– the architecture is just schema for when & how to compress
– optimization is a compression *process*, with its own compression level and duration
– (architecture + data + optimization) = model
– in other

→ View original post on X — @jxmnop

9 September 2025