Nice work! In case you get stuck, I've implemented GQA as part of the GPT -> Llama conversion guides: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/07_gpt_to_llama
…
But since you already implemented MQA, GQA should be pretty smooth sailing 🙂
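
For a rough picture of why the step from MQA is small: MQA shares a single key/value head across all query heads, while GQA shares one key/value head per group of query heads. The sketch below is only an illustration of that head-to-group mapping, not the code from the linked guide; the function name `kv_head_for_query_head` and the parameter `num_kv_groups` are made up for this example, and it assumes consecutive query heads share a key/value head.

```python
def kv_head_for_query_head(query_head: int, num_heads: int, num_kv_groups: int) -> int:
    """Return the index of the key/value head that a given query head uses.

    Hypothetical helper for illustration only; assumes consecutive query
    heads are grouped together.
    """
    if num_heads % num_kv_groups != 0:
        raise ValueError("num_heads must be divisible by num_kv_groups")
    group_size = num_heads // num_kv_groups  # query heads per shared K/V head
    return query_head // group_size


num_heads = 8

# MQA: a single K/V head shared by every query head (num_kv_groups = 1)
mqa = [kv_head_for_query_head(h, num_heads, 1) for h in range(num_heads)]

# GQA: here 2 K/V heads, each shared by 4 query heads
gqa = [kv_head_for_query_head(h, num_heads, 2) for h in range(num_heads)]

# Regular multi-head attention: one K/V head per query head
mha = [kv_head_for_query_head(h, num_heads, num_heads) for h in range(num_heads)]

print(mqa)  # [0, 0, 0, 0, 0, 0, 0, 0]
print(gqa)  # [0, 0, 0, 0, 1, 1, 1, 1]
print(mha)  # [0, 1, 2, 3, 4, 5, 6, 7]
```

So an MQA implementation is effectively the `num_kv_groups = 1` case, and GQA mostly means making that number configurable.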