Learn how to 2x LLM inference speeds with speculative decoding! We're introducing Medusa in our next release so you can accelerate inference by fine-tuning open-source LLMs, whether or not you have labeled data. Join us on May 23rd at 10am PT! https://pbase.ai/4ajjIxR
Double LLM Inference Speed with Medusa Speculative Decoding