So you have a 200 TB/s fire hose of recycling information. You would have to “pump it up” over many cycles from much slower conventional memory, but once full it could feed straight into a field of tensor core registers for real-time model inference. Fun to think about!
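For a rough feel of what "many cycles" means, here is a back-of-envelope sketch in Python. Only the 200 TB/s figure comes from the comment above. The loop capacity and the conventional-memory bandwidth are hypothetical placeholders, and the function names are made up for illustration, so plug in your own numbers.

```python
TB = 1e12  # bytes per terabyte (decimal)

# From the comment above: bandwidth of the recirculating store.
LOOP_BANDWIDTH = 200 * TB  # bytes per second


def fill_time_seconds(capacity_bytes: float, source_bandwidth: float) -> float:
    """Time to load the loop once from the slower source memory."""
    return capacity_bytes / source_bandwidth


def recirculation_period_seconds(capacity_bytes: float, loop_bandwidth: float) -> float:
    """Time for the loop's full contents to stream past a read point once."""
    return capacity_bytes / loop_bandwidth


def cycles_to_fill(capacity_bytes: float, source_bandwidth: float, loop_bandwidth: float) -> float:
    """How many recirculation periods elapse while the loop is being filled."""
    return fill_time_seconds(capacity_bytes, source_bandwidth) / recirculation_period_seconds(
        capacity_bytes, loop_bandwidth
    )


# ASSUMPTIONS (not from the original comment): illustrative values only.
assumed_capacity = 1e9          # hypothetical 1 GB held in the loop
assumed_source_bw = 1 * TB      # hypothetical 1 TB/s conventional memory

fill = fill_time_seconds(assumed_capacity, assumed_source_bw)
period = recirculation_period_seconds(assumed_capacity, LOOP_BANDWIDTH)
cycles = cycles_to_fill(assumed_capacity, assumed_source_bw, LOOP_BANDWIDTH)

print(f"fill time: {fill:.3e} s")
print(f"recirculation period: {period:.3e} s")
print(f"recirculation cycles during fill: {cycles:.1f}")
```

The capacity cancels out of the last quantity, so under this simple model the number of cycles spent pumping up is just the ratio of the loop bandwidth to the source bandwidth, whatever the loop holds.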
High-speed tensor core inference with massive memory bandwidth