On Stable Beluga 2.5 70B, our fine-tuned version of Llama 2 70B, Gaudi 2 delivered 28% higher inference throughput than the A100, measured in tokens per second per accelerator. Read the full analysis, and learn how our findings underscore the need for alternative compute solutions, here:
Gaudi 2 Achieves 28% Faster Inference Speed Than A100