AI Dynamics

Global AI News Aggregator

1.2B LLM runs at over 200 tokens per second in the browser

1B model running over 200 tok/s in your browser 👀

Xenova (@xenovacom): Okay, this is actually insane… You can now run LFM2.5-1.2B-Thinking (a 1.2B parameter LLM from @LiquidAI) at over 200 tokens per second directly in your browser on WebGPU! 🤯 Zero install. Fully private. Blazingly fast. Powered by Transformers.js and ONNX Runtime Web.

https://nitter.net/xenovacom/status/2026727703836004796#m

→ View original post on X — @maximelabonne, 2026-02-25 18:47 UTC