Distributed training across thousands of traditional accelerators suffers diminishing returns as more compute is added. Join Natalia Vassilieva at #SC22 as she presents the Cerebras Wafer-Scale Cluster, which achieves near-perfect linear scaling as cores are added.
#AI #SC22
Cerebras Wafer-Scale Cluster Achieves Linear Scaling Performance
