AI Dynamics

Global AI News Aggregator

LLaMA 3 Extended Context Window Reaches 128K Tokens

It's been a week since LLaMA 3 dropped. In that time, we've:
– extended context from 8K -> 128K
– trained multiple ridiculously performant fine-tunes
– got inference working at 800+ tokens/second

If Meta keeps releasing OSS models, closed providers won't be able to compete.

→ View original post on X — @mattshumer_