@theahmadosman - AI Dynamics

Training models, the new ‘learn to code’

By

–

07 May 2026 17h24

Training models is gonna become the new learn to code and will be automated to a high degree.
Novel research is another thing and has already been automated to some extent.
Synthetic data might be the only thing yet to crack; it's a matter of time.
Compute is the ONLY TRUE moat.

→ View original post on X — @theahmadosman

7 May 2026

Current SoTA LLM Ranking as of May 2026

By

@theahmadosman

–

06 May 2026 9h17

Current SoTA LLMs Ranking:
GPT 5.5 > Kimi K2.6 > GLM 5.1 > Qwen 3.6 397B > MiniMax M2.7
And yes, Opus 4.7 is slop; I left Claude models out intentionally.

→ View original post on X — @theahmadosman

6 May 2026

Cloning Sglang Mini and using Codex with GPT-5.5 to learn inference engines

By

@theahmadosman

–

06 May 2026 8h06

There’s too much alpha in cloning Sglang Mini and asking Codex CLI w/ GPT 5.5 to teach you how inference engines work through that cloned repo.

→ View original post on X — @theahmadosman

6 May 2026

LLM pricing per token mirrors the SaaS era: buy a GPU

By

@theahmadosman

–

06 May 2026 1h40

Charging $ per 1M tokens is basically the SaaS era of LLMs.
That’s why I keep saying: Buy a GPU.
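
To make the per-token versus buy-a-GPU trade-off concrete, here is a minimal break-even sketch. The function name and every number passed to it are hypothetical; none of them come from the post, and real costs (power, depreciation, throughput) would have to be plugged in by the reader.

```python
from typing import Optional


def break_even_tokens(
    gpu_cost_usd: float,
    api_price_per_million_usd: float,
    local_cost_per_million_usd: float = 0.0,
) -> Optional[float]:
    """Tokens needed before a one-time GPU purchase beats per-token pricing.

    All inputs are illustrative assumptions. local_cost_per_million_usd
    stands in for electricity and similar running costs.
    Returns None if running locally never pays for itself.
    """
    saving_per_million = api_price_per_million_usd - local_cost_per_million_usd
    if saving_per_million <= 0:
        return None
    return gpu_cost_usd / saving_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical figures: a 2000 USD card against 10 USD per 1M tokens.
    print(break_even_tokens(2000, 10))
```

Under these made-up figures the purchase pays off only after a large token volume, so the argument is strongest for heavy, sustained usage.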

→ View original post on X — @theahmadosman

6 May 2026

Choose Hardware First, Then the Inference Engine Follows

By

@theahmadosman

–

05 May 2026 23h40

You don’t pick an Inference Engine.
You pick a Hardware Strategy.
Then the Engine follows.

Inference Engines Breakdown (Cheat Sheet at the bottom)

> llama.cpp
runs anywhere
CPU, GPU, Mac, weird edge boxes
best when VRAM is tight and RAM is plenty
hybrid offload, GGUF,
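
The one rule that survives in this excerpt (llama.cpp when VRAM is tight and RAM is plenty) can be sketched as a small check. This is an assumption-laden illustration: the function names are hypothetical, layers are treated as equally sized, and KV cache and activation memory are ignored. The rest of the cheat sheet is cut off, so other engines are deliberately not guessed at.

```python
from typing import Optional


def suggest_engine(model_gb: float, vram_gb: float, ram_gb: float) -> Optional[str]:
    """Return 'llama.cpp' for the VRAM-tight, RAM-plenty case, else None.

    None means 'not covered by the excerpt', not 'no engine exists'.
    """
    too_big_for_vram = model_gb > vram_gb
    fits_with_system_ram = model_gb <= vram_gb + ram_gb
    if too_big_for_vram and fits_with_system_ram:
        return "llama.cpp"
    return None


def gpu_layers(n_layers: int, model_gb: float, vram_gb: float) -> int:
    """Rough count of layers that fit on the GPU for hybrid offload.

    Assumes equally sized layers (a simplification). The result is the kind
    of value one might pass to llama.cpp's --n-gpu-layers option.
    """
    return min(n_layers, int(n_layers * vram_gb / model_gb))


if __name__ == "__main__":
    # Hypothetical box: 40 GB model, 24 GB VRAM, 64 GB RAM, 32 layers.
    print(suggest_engine(40, 24, 64), gpu_layers(32, 40, 24))
```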

→ View original post on X — @theahmadosman

5 May 2026

Free Hugging Face Mirror Offered If Platform Goes Down

By

@theahmadosman

–

05 May 2026 0h52

If @huggingface gets taken down, I got your back for free.

→ View original post on X — @theahmadosman

5 May 2026

Local LLMs Web Stack Setup With SearXNG, Firecrawl, and Camofox

By

@theahmadosman

–

04 May 2026 22h47

PRO TIP: Using local LLMs? Give them a web stack.

My setup:
– SearXNG: candidate source discovery
– Firecrawl: known-URL scraping and crawling
– Camofox: browser fallback when JS/interaction gets annoying

Search → Extract → Interact

Tell your favorite agent to set this up,
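
The Search → Extract → Interact flow can be sketched as a pipeline with three pluggable steps. This is a hedged outline, not the author's setup: `research`, `fetch_page`, and the `min_chars` threshold are hypothetical, and the three tools are passed in as plain callables so no particular Firecrawl or Camofox API is assumed. The only tool-specific detail is the SearXNG JSON search URL, which works only if JSON output is enabled on the instance.

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import urlencode

Search = Callable[[str], List[str]]         # query -> candidate URLs
Fetch = Callable[[str], Optional[str]]      # URL -> page text, or None


def searxng_query_url(base_url: str, query: str) -> str:
    """Build a SearXNG search URL that asks for JSON results."""
    params = urlencode({"q": query, "format": "json"})
    return f"{base_url.rstrip('/')}/search?{params}"


def fetch_page(url: str, extract: Fetch, interact: Fetch,
               min_chars: int = 200) -> Optional[str]:
    """Try the scraper first; fall back to the browser if the text is thin."""
    text = extract(url)
    if text and len(text) >= min_chars:
        return text
    return interact(url)


def research(query: str, search: Search, extract: Fetch, interact: Fetch,
             max_sources: int = 3) -> Dict[str, str]:
    """Search -> Extract -> Interact over the top candidate sources."""
    pages: Dict[str, str] = {}
    for url in search(query)[:max_sources]:
        content = fetch_page(url, extract, interact)
        if content:
            pages[url] = content
    return pages
```

Keeping the steps as injected callables means each tool can be swapped or stubbed without touching the fallback logic.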

→ View original post on X — @theahmadosman

4 May 2026

MLX vs llama.cpp for Inference on Apple Silicon

By

@theahmadosman

–

04 May 2026 22h00

MLX is often better on Apple Silicon, in my experience.
llama.cpp is my fallback method.
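
As a rough illustration of that preference order, the sketch below detects Apple Silicon with the standard `platform` module and returns engines in the order to try them. The function names are hypothetical, and returning only llama.cpp on other machines is an assumption rather than something this post states.

```python
import platform
from typing import List, Optional


def is_apple_silicon(system: Optional[str] = None,
                     machine: Optional[str] = None) -> bool:
    """True on an ARM Mac. Arguments exist only to allow testing elsewhere."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"


def engine_preference(system: Optional[str] = None,
                      machine: Optional[str] = None) -> List[str]:
    """Engines to try, first choice first."""
    if is_apple_silicon(system, machine):
        return ["mlx", "llama.cpp"]   # MLX first, llama.cpp as fallback
    return ["llama.cpp"]              # assumption for non-Apple hardware


if __name__ == "__main__":
    print(engine_preference())
```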

→ View original post on X — @theahmadosman

4 May 2026

Run Local AI Easily Using Codex CLI and Optimized Inference

By

@theahmadosman

–

04 May 2026 21h51

Let me make local AI easy for you.

Give Codex CLI the tweet below & tell it:
– Infer the right Inference Engine from your hardware + tweet content below
– Use uv+venv
– Pick the right kernels
– Tune flags, batching, KVCache, etc
– Optimize for your hardware & chosen model

Enjoy

→ View original post on X — @theahmadosman

4 May 2026