Paper: The TPUv4 system uses an optically reconfigurable network to assemble groups of 4x4x4 chips like Lego bricks into larger shapes (perhaps 4x4x12 or 16x16x16). SparseCores help with embeddings. TPUv4 outperforms TPUv3 by 2.1x and improves performance per watt by 2.7x, and the system has 4096 chips, so it is roughly 10x faster overall.
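The arithmetic behind those shapes and the overall speedup can be checked with a short sketch. It assumes that larger shapes are built only from whole 4x4x4 blocks, and that the TPUv3 system being compared has 1024 chips; that second figure is not stated in the summary above and is an assumption made here for illustration. The helper names are hypothetical.

```python
from math import prod

BLOCK = (4, 4, 4)  # building block named in the summary


def chips(shape):
    """Total number of chips in a 3D shape."""
    return prod(shape)


def blocks_needed(shape, block=BLOCK):
    """Number of 4x4x4 blocks needed to tile the shape exactly."""
    if any(s % b for s, b in zip(shape, block)):
        raise ValueError(f"{shape} is not a whole number of {block} blocks")
    return prod(s // b for s, b in zip(shape, block))


for shape in [(4, 4, 4), (4, 4, 12), (16, 16, 16)]:
    print(shape, "chips:", chips(shape), "blocks:", blocks_needed(shape))

# ASSUMPTION (not in the summary): the TPUv3 system has 1024 chips.
TPUV3_CHIPS_ASSUMED = 1024
PER_CHIP_SPEEDUP = 2.1  # from the summary

# Naive estimate: per-chip speedup times the ratio of chip counts.
naive_speedup = PER_CHIP_SPEEDUP * chips((16, 16, 16)) / TPUV3_CHIPS_ASSUMED
print("naive system speedup:", round(naive_speedup, 1))
```

Under these assumptions a 16x16x16 shape is the full 4096-chip system (64 blocks), a 4x4x12 shape is 192 chips (3 blocks), and the naive estimate comes out near 8.4x, which is in line with the "roughly 10x" figure once rounded.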
Google TPUv4 Optical Network 4096 Chips 10x Performance