Serving ChatGPT to 100M users with 1k H100s efficiently
By Global AI News Aggregator
With load balancing, caching, and dynamic batching, and assuming 10% of users are active on a given day (10% DAU), you probably don’t need much more than 1k H100s to serve ChatGPT to 100 million customers.
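To make the arithmetic behind that claim concrete, here is a rough back-of-envelope sketch. Only the user count, the 10% DAU figure, and the 1k GPU count come from the post. The requests per user, tokens per response, and cache hit rate are hypothetical placeholders chosen for illustration, not measured values, and the result is an average that ignores peak-hour traffic.

```python
# Back-of-envelope capacity estimate (a sketch, not a sizing tool).
# Values marked ASSUMPTION are illustrative placeholders, not data from the post.

TOTAL_USERS = 100_000_000        # from the post: 100 million customers
DAU_FRACTION = 0.10              # from the post: 10% daily active users
NUM_GPUS = 1_000                 # from the post: roughly 1k H100s

REQUESTS_PER_USER_PER_DAY = 10   # ASSUMPTION: hypothetical usage level
TOKENS_PER_RESPONSE = 500        # ASSUMPTION: hypothetical response length
CACHE_HIT_RATE = 0.0             # ASSUMPTION: fraction of requests served from cache

SECONDS_PER_DAY = 86_400


def daily_users_per_gpu(
    total_users: int = TOTAL_USERS,
    dau_fraction: float = DAU_FRACTION,
    num_gpus: int = NUM_GPUS,
) -> float:
    """Average number of daily active users each GPU has to serve."""
    return total_users * dau_fraction / num_gpus


def required_tokens_per_gpu_per_second(
    total_users: int = TOTAL_USERS,
    dau_fraction: float = DAU_FRACTION,
    num_gpus: int = NUM_GPUS,
    requests_per_user: float = REQUESTS_PER_USER_PER_DAY,
    tokens_per_response: float = TOKENS_PER_RESPONSE,
    cache_hit_rate: float = CACHE_HIT_RATE,
) -> float:
    """Average generated tokens per second each GPU must sustain.

    Assumes load balancing spreads traffic evenly across GPUs and across
    the day; real traffic has peaks, so this is a lower bound on need.
    """
    daily_active = total_users * dau_fraction
    # Requests answered from cache never reach a GPU.
    gpu_requests = daily_active * requests_per_user * (1.0 - cache_hit_rate)
    tokens_per_day = gpu_requests * tokens_per_response
    return tokens_per_day / (num_gpus * SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"Daily users per GPU: {daily_users_per_gpu():,.0f}")
    print(f"Tokens/s per GPU:    {required_tokens_per_gpu_per_second():,.1f}")
```

With these placeholder inputs each GPU serves 10,000 daily users and needs to average roughly 579 generated tokens per second. Whether an H100 can sustain that depends on model size and on how well dynamic batching keeps the GPU busy, neither of which the post specifies, so treat the output as a way to explore the claim rather than to confirm it.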