We’ve reduced the model latency by 4-5x, serving results on average in 0.65 seconds instead of 3.15 seconds (FT-GPT-3.5 compared to GPT-4). You may notice the speedup when Copilot prompts you for user input. Every second counts, and we’re here to make them all productive. pic.twitter.com/dGTu8aYtXw
— Perplexity (@perplexity_ai) August 25, 2023