We're so back! Tencent has released Hunyuan Large, a 389B-parameter MoE model (52B active) that it reports beating Llama 3.1 405B, Mistral's Mixtral 8x22B, and DeepSeek V2. The model is multilingual, supports a 128K context window, and uses GQA plus Cross-Layer Attention (CLA) for KV-cache compression and higher throughput. Pre-train, Instruct, and FP8 checkpoints are available on the Hugging Face Hub.
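To see why GQA and CLA matter for long contexts, here is a rough back-of-the-envelope sketch of KV-cache size. The dimensions below are hypothetical illustration values, not Hunyuan Large's actual config: GQA shrinks the number of KV heads, and CLA shares one KV cache across groups of adjacent layers.

```python
def kv_cache_bytes(num_layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, dtype_bytes: int = 2, cla_share: int = 1) -> int:
    """Approximate KV-cache size for one sequence.

    cla_share: how many adjacent layers share one KV cache (CLA);
    1 means no sharing. All dimensions here are illustrative.
    """
    # Only every cla_share-th layer stores its own K and V tensors.
    distinct_layers = num_layers // cla_share
    # Factor 2 accounts for storing both K and V.
    return 2 * distinct_layers * kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical 64-layer model at 128K context, fp16:
mha = kv_cache_bytes(num_layers=64, kv_heads=64, head_dim=128, seq_len=131072)
gqa_cla = kv_cache_bytes(num_layers=64, kv_heads=8, head_dim=128,
                         seq_len=131072, cla_share=2)
print(f"{mha / 2**30:.1f} GiB -> {gqa_cla / 2**30:.1f} GiB "
      f"({mha // gqa_cla}x smaller)")
```

With these made-up numbers, 8x fewer KV heads (GQA) times 2x layer sharing (CLA) gives a 16x smaller cache, which is what makes 128K contexts and higher serving throughput practical.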
Tencent Hunyuan Large 389B: New LLM Outperforms Llama and DeepSeek