AI Dynamics

Global AI News Aggregator

1.2B-parameter LLM runs at over 200 tokens per second in the browser

"1B model running over 200 tok/s in your browser πŸ‘€" β€” @maximelabonne, quoting Xenova (@xenovacom): "Okay, this is actually insane… You can now run LFM2.5-1.2B-Thinking (a 1.2B-parameter LLM from @LiquidAI) at over 200 tokens per second directly in your browser on WebGPU! 🀯 Zero install. Fully private. Blazingly fast. Powered by Transformers.js and ONNX Runtime Web." β€” https://nitter.net/xenovacom/status/2026727703836004796#m

β†’ View original post on X β€” @maximelabonne, 2026-02-25 18:47 UTC
