Global AI News Aggregator
Facebook added it too: developers.facebook.com/llms…
→ View original post on X — @jeremyphoward, 2026-03-31 02:37 UTC

Okay LLM + PyTorch people, trunc_normal_, what the fuck! Many LLM inits use it with the default cutoffs. It's either not doing anything or it's quite broken, due to two issues.

1. The a/b cutoffs in PyTorch are not in std-devs, they are absolute. So with std=0.02 and the default -2/2 cutoffs, that's 100σ!! That's just a normal distribution; the truncation isn't doing anything.

2. There are numerical issues. Even in float32, the truncation produces a handful of -2 (lower cutoff) values, i.e. 100σ!! That's incomprehensibly improbable. I doubt a float32 or even float64 algorithm could actually produce such a value, but clamping a bad float does.

Olmo (@allenai codebases) appears to be one of the few that uses trunc_normal_ and bothers to set the cutoffs properly. It'd be nice to see more training code opened up as a default. We so often end up with only a sanitized version of the inference/fine-tune-friendly model these days, and may lose details like the original init. I've known about #1 for ages; I have an alternate trunc_normal_tf_ implementation in timm for that reason. But I saw those -2's last week while debugging something and was a little surprised.
→ View original post on X — @jeremyphoward, 2026-03-30 15:09 UTC
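A minimal sketch of both issues, assuming a recent PyTorch build (the exact count of clamped values varies by tensor size and seed); the corrected call at the end simply scales the cutoffs by std:

```python
import torch
from torch import nn

w = torch.empty(4096, 4096)

# Issue 1: the default cutoffs a=-2.0, b=2.0 are absolute, not in std-devs,
# so with std=0.02 they sit at +/-100 sigma and the truncation is a no-op.
nn.init.trunc_normal_(w, mean=0.0, std=0.02)  # a=-2.0, b=2.0 by default
print(w.abs().max() / 0.02)  # ~5-6 sigma on a tensor this size, nowhere near 100

# Issue 2: numerical edge cases get clamped to the cutoff itself, yielding
# "impossible" 100-sigma values of exactly -2.0 on large tensors.
print((w == -2.0).sum())  # occasionally nonzero

# To truncate at 2 std-devs, as presumably intended, scale the cutoffs:
std = 0.02
nn.init.trunc_normal_(w, mean=0.0, std=std, a=-2 * std, b=2 * std)
print((w.abs() <= 2 * std).all())  # True: everything within 2 sigma
```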

Alec is a once-in-a-generation researcher, but saying that he invented pretraining is not only a bit of a stretch, it's also disrespectful to other people's work. Flowers ☾ (@flowersslop): "Every LLM from any lab today traces back to this guy, who was the only person at OpenAI pushing for pretraining transformer language models. He built GPT-1. Only after that did others see the potential. He invented it, and almost none of the so-called AI experts even know his name." — https://nitter.net/flowersslop/status/2037892926785634720#m
→ View original post on X — @jeremyphoward, 2026-03-29 12:21 UTC

Every LLM from any lab today, including from OpenAI, traces back to @jeremyphoward and @seb_ruder with their ULMFiT paper. The breakthrough for LLMs was transfer learning, not attention. Flowers ☾ (@flowersslop): "Every LLM from any lab today traces back to this guy, who was the only person at OpenAI pushing for pretraining transformer language models. He built GPT-1. Only after that did others see the potential. He invented it, and almost none of the so-called AI experts even know his name." — https://nitter.net/flowersslop/status/2037892926785634720#m
→ View original post on X — @jeremyphoward, 2026-03-28 20:34 UTC
We as software engineers are becoming beholden to a handful of well-funded corporations. While they are our "friends" now, that may change due to incentives. I'm very uncomfortable with that. I believe we need to band together as a community and create a public, free-to-use repository of real-world (coding) agent sessions/traces. I want small labs, startups, and tinkerers to have access to the same data the big folks currently gobble up from all of us, so we, as a community, can do what e.g. Cursor does below, and take back a little bit of control. Who's with me? cursor.com/blog/real-time-rl…
→ View original post on X — @jeremyphoward, 2026-03-28 08:38 UTC


"naming their next model after Cthulhu" 😒 — "Naming their next model after Cthulhu makes it hard to take Anthropic seriously as the good guys. It's fun at any other software company, not one that actually is flirting with extinction." — Eliezer Yudkowsky (@allTheYud) [Translated from EN to English]
→ View original post on X — @jeremyphoward, 2026-03-27 18:36 UTC
130 lines instead of 870. That's the difference between our conv2d implementation on Blackwell and CUTLASS's. We broke kernels into three swappable pieces: one for moving data, one for coordinating the pipeline, one for compute. When you need a new kernel, you only change the piece that actually needs to change. Part 3 of our Structured Mojo Kernels series walks through the details: modular.com/blog/structured-…
→ View original post on X — @jeremyphoward, 2026-03-27 15:00 UTC
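The three-way split described above is an architectural pattern rather than anything Mojo-specific. A hedged sketch of the idea in Python (all names here are invented for illustration; this is not Modular's API): each piece sits behind a tiny interface, and the kernel driver just composes them, so a new kernel or a new hardware target changes only the piece that actually needs to change.

```python
from typing import Protocol, Sequence

class DataMover(Protocol):
    """Moves one tile of input into fast memory."""
    def load_tile(self, i: int) -> Sequence[float]: ...

class Scheduler(Protocol):
    """Decides the order and overlap of pipeline steps."""
    def steps(self, n_tiles: int) -> Sequence[int]: ...

class Compute(Protocol):
    """The actual math on a resident tile."""
    def apply(self, tile: Sequence[float]) -> float: ...

def run_kernel(mover: DataMover, sched: Scheduler, comp: Compute,
               n_tiles: int) -> list[float]:
    # The driver only wires the three pieces together; swapping in a
    # different DataMover (say, for new hardware) leaves the rest untouched.
    return [comp.apply(mover.load_tile(i)) for i in sched.steps(n_tiles)]
```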

Mac Mini + eGPU. Both NVIDIA and AMD supported.
→ View original post on X — @jeremyphoward, 2026-03-14 04:16 UTC