the main 'thought cycle' practiced during a PhD:
– identify region of problem space that lacks coherence
– set aside time to actually think about it
– emerge with a concrete 'thought advancement', something you hadn't realized before
optimize this process and become powerful
@jxmnop
-
Optimizing the PhD Thought Cycle for Intellectual Breakthroughs
-
Model Communication Protocol: From Text-Based Design to Dominance
Model Communication is going to happen slowly, and then all at once:
Level 0: We design a text-based protocol for models & programs to communicate (read: MCP)
Level 1: Text-based model communication grows to exceed human communication in worldwide internet bitstream volume…
dr. jack morris (@jxmnop), August 6, 2025, pic.twitter.com/3vP6yPBmpT
-
Future SOTA AI Models Training on Massive Token Sequences
clearly in five years SOTA AI models will train on a single string containing approximately 2^21 tokens
-
Training Methods Critical for AI Model Performance
well they have to train it like this or it wouldn't work nearly as well
-
AI Datasets Shift: Fewer Examples, Longer Sequences
an interesting trend in AI is that the best datasets have fewer and fewer, longer and longer sequences
dataset five years ago:
~10^5 examples, each of 2^6 tokens
nowadays:
~10^3 examples, each of 2^15 tokens
it’s actually more data. but the tokens are stacked horizontally now -
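The "it's actually more data" claim can be checked with quick arithmetic. The sketch below simply multiplies out the rough figures quoted in the dataset comparison above; the variable names are illustrative, and the counts are order-of-magnitude estimates from the tweet rather than measurements of any particular dataset.

```python
# Token budgets implied by the two dataset shapes quoted in the tweet.
old_examples, old_len = 10**5, 2**6    # ~10^5 examples, 64 tokens each
new_examples, new_len = 10**3, 2**15   # ~10^3 examples, 32,768 tokens each

old_total = old_examples * old_len     # total tokens, five years ago
new_total = new_examples * new_len     # total tokens, nowadays

print(f"five years ago: {old_total:,} tokens")
print(f"nowadays:       {new_total:,} tokens")
print(f"ratio:          {new_total / old_total:.2f}x")
```

Under these figures the newer shape holds about five times as many tokens, spread over a hundred times fewer examples.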
Where Are Encoder-Decoders Being Used in AI?
oh where have you been seeing encoder-decoders come up?
-

RETRO: Revisiting DeepMind’s Knowledge Outsourcing Architecture
RETRO (DeepMind, 2021) is a beautiful idea, one badly in need of revisiting
the central innovation of RETRO is to have a small model decide what token to predict next, but outsource all knowledge to a large offline datastore
this has the added benefit of allowing you to insert
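To make the "outsource knowledge to a datastore" idea concrete, here is a deliberately tiny sketch of nearest-neighbour lookup over stored text. It is not RETRO's implementation: the real system retrieves chunks with a frozen neural encoder and feeds them to the language model through cross-attention, whereas this toy uses a bag-of-words similarity and returns the stored continuation directly. The `Datastore` class, `embed`, and `cosine` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a stand-in for a frozen neural encoder."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class Datastore:
    """Hypothetical offline store mapping a context to the text that followed it."""

    def __init__(self):
        self.entries = []  # list of (context embedding, continuation)

    def add(self, context: str, continuation: str) -> None:
        self.entries.append((embed(context), continuation))

    def retrieve(self, query: str, k: int = 1) -> list:
        """Return the continuations of the k stored contexts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [continuation for _, continuation in ranked[:k]]

store = Datastore()
store.add("the capital of france is", "paris")
store.add("the capital of japan is", "tokyo")

print(store.retrieve("what is the capital of japan"))
```

The point of the split is that the facts live in `store.entries` rather than in model weights, so the predictor itself can stay small.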
-
Open Models Cannot Match This Capability Yet
we don't have any open model that produces anything remotely like this and that is my point
-
DeepSeek Outputs Are Strikingly Different from Other AI Models
yes me too! but it's still so different than the deepseek outputs, so striking
-

OpenAI vs DeepSeek: AI Reasoning Race Accelerates Dramatically
these reasoning traces have been keeping me up at night
on the left: new OpenAI model that got IMO gold
on the right: DeepSeek R1 on a random math problem
you need to realize that since last year academia has produced over a THOUSAND papers on reasoning (probably much more).