Narrative violation: DeepSeek V3 is a competitive model, but NOT a top one. The SEAL leaderboards have been updated with DeepSeek V3 (Mar 2025):
– 8th on Humanity’s Last Exam (text-only).
– 12th on MultiChallenge (multi-turn).
View the full rankings: http://scale.com/leaderboard
DeepSeek V3 Ranked 8th and 12th on SEAL Leaderboards