Command A+ sets a new high for Cohere's machine translation capabilities, opening a clear gap over open-source peers Mistral Medium 3.5, DeepSeek, & OpenAI's gpt-oss, as well as Claude Opus 4.6. A+ also outperforms specialist systems like Google Translate. RWS is better… but
@cohere
-

Command A+ Translation Model Outperforms Competitors on WMT24++ Benchmark
The improvements run wide. Across all major European languages, Command A+ consistently pulls ahead of competitors on WMT24++ (xCOMET-XL): +2.4 pts in French, +1.9 pts in Spanish, and +0.9 pts in German. Higher translation quality means fewer corrections, stronger retrieval,
-
Cohere Compass for Complex Document Analysis
Searching unstructured data, such as scans of handwritten and typed declassified documents, can be challenging. Cohere Compass, including its Compass Visual feature, is designed to process and retrieve information from even the most complex documents.
-
Cohere Partners with Ottawa Convention Centre, Renamed Cohere Centre
Say hello to the Cohere Centre! We’re proud to partner with Ottawa’s premier convention and event facility as it enters an exciting new chapter. The Cohere Centre will serve as a hub where leaders from business, government, technology, and the community come together right
-

Cohere and Aleph Alpha Form Transatlantic Sovereign AI Powerhouse
Sovereign AI for the world. Cohere & Aleph Alpha form transatlantic AI powerhouse anchored in Canada & Germany! Combining our global scale with European R&D excellence to build sovereign, enterprise-grade AI. Security, privacy & trust for businesses & governments worldwide.
-
Cohere Hiring ML Systems and Audio Inference Engineers
Enjoyed the read? If you have deep experience in ML frameworks (training or inference) and love working on problems like these, our team is hiring!
ML Systems Engineer, Frameworks & Tooling: https://jobs.ashbyhq.com/cohere/c99e61c9-ed92-426d-9711-188dfc0f729f?departmentId=7130c75e-15b8-493c-959f-e9b8ea5c1c09 …
Audio Inference Engineer, Model Efficiency:
-

AWQ Quantization Optimization for Agentic AI Workloads
For real agentic workloads (North), short-context calibration wasn't enough. We calibrated AWQ on long internal agentic traces (up to 64k tokens) and added token masking in llm-compressor to exclude repetitive chat templates/tool descriptions from calibration stats. Plus QAD
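To make the token-masking idea concrete, here is a minimal sketch of AWQ-style calibration statistics (mean absolute activation per input channel) computed only over tokens that are kept by a mask. It does not use llm-compressor's actual API; `masked_channel_means` and the toy data are hypothetical, and the real masking logic may differ.

```python
from typing import List, Sequence


def masked_channel_means(
    activations: Sequence[Sequence[float]],  # shape: [num_tokens][num_channels]
    keep_mask: Sequence[bool],               # True = token contributes to the stats
) -> List[float]:
    """Mean absolute activation per channel, over kept tokens only."""
    if len(activations) != len(keep_mask):
        raise ValueError("one mask entry per token is required")
    kept = [row for row, keep in zip(activations, keep_mask) if keep]
    if not kept:
        raise ValueError("mask excludes every token")
    num_channels = len(kept[0])
    return [
        sum(abs(row[c]) for row in kept) / len(kept)
        for c in range(num_channels)
    ]


# Toy trace: the first two tokens stand in for a repeated chat template.
acts = [[8.0, 0.0], [8.0, 0.0], [1.0, -2.0], [-3.0, 4.0]]
mask = [False, False, True, True]

unmasked = masked_channel_means(acts, [True] * len(acts))
masked = masked_channel_means(acts, mask)
```

In this toy case the template tokens dominate channel 0 when nothing is masked, which is the kind of skew that excluding repetitive templates and tool descriptions is meant to avoid.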
-

BF16 to FP8 Quantization: Per-Channel Scaling for LLM Accuracy
The tricky part: naïvely casting BF16 group scales to FP8 degraded quality. Our fix: quantize scales per-channel (outer vector scaling) + rescale by 1/8 to avoid FP8 clipping. Result: >99.5% of W4A16 accuracy recovered on Command A & Cohere MoE. Paired with a CUTLASS
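The range arithmetic behind this can be sketched as follows. This is one plausible reading, not the actual kernel code: it assumes the 1/8 factor leaves headroom for signed 4-bit weights (magnitude up to 8), so that a weight times its stored group scale stays within the FP8 E4M3 maximum of 448. `factor_group_scales` is a hypothetical helper, scales are assumed positive, and FP8 rounding itself is not simulated.

```python
FP8_E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 format
INT4_ABS_MAX = 8      # signed 4-bit weights lie in [-8, 7]


def factor_group_scales(group_scales):
    """Split [num_groups][num_channels] scales into stored group scales
    that fit the reduced FP8 range, plus one scale per output channel."""
    num_channels = len(group_scales[0])
    target = FP8_E4M3_MAX / INT4_ABS_MAX  # 56.0: the assumed 1/8 headroom
    channel_scales = [
        max(row[c] for row in group_scales) / target
        for c in range(num_channels)
    ]
    stored = [
        [row[c] / channel_scales[c] for c in range(num_channels)]
        for row in group_scales
    ]
    return stored, channel_scales


# Hypothetical BF16 group scales: 2 groups x 2 output channels.
scales = [[0.02, 0.5], [0.01, 0.25]]
stored, chan = factor_group_scales(scales)

# Largest stored scale per channel is about 56, so |weight * scale| <= 448.
worst_case = INT4_ABS_MAX * max(max(row) for row in stored)

# The original scale is recovered as stored * channel scale.
recovered = [[s * chan[c] for c, s in enumerate(row)] for row in stored]
```

Under these assumptions, the per-channel factor absorbs the dynamic range that a plain FP8 cast would lose, while the stored group scales stay well inside the representable range.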
-

W4A8 Inference Production-Ready Integration in vLLM
Excited to share our work on production-ready W4A8 inference, now integrated in vLLM! By combining 4-bit weights (low memory) with 8-bit activations (high compute), we hit the sweet spot for both decoding and prefill — up to 58% faster TTFT and 45% faster TPOT vs W4A16 on Hopper.
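As a rough illustration of the "low memory" side, the sketch below estimates weight storage for 16-bit versus 4-bit weights. The 1B parameter count, group size of 128, and 8-bit group scales are assumptions chosen for the example, not figures from the post, and `weight_bytes` is a hypothetical helper.

```python
def weight_bytes(num_params: int, weight_bits: int,
                 group_size: int = 0, scale_bits: int = 0) -> float:
    """Approximate weight storage in bytes, plus optional per-group scales."""
    payload = num_params * weight_bits / 8
    scales = (num_params / group_size) * scale_bits / 8 if group_size else 0.0
    return payload + scales


params = 1_000_000_000  # hypothetical 1B-parameter model

bf16 = weight_bytes(params, 16)                          # 16-bit baseline
w4 = weight_bytes(params, 4, group_size=128, scale_bits=8)  # 4-bit + scales
ratio = bf16 / w4
```

With these assumed settings the 4-bit layout needs a little over a quarter of the 16-bit footprint; the scale overhead is why the ratio falls slightly short of 4x.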
-
Speculative Decoding Optimization for Mixture of Experts Models
Get more from speculative decoding in MoE models
