Yeah, I happen to know how hard it is because I tried to figure it out directly from the paper, and I had to go all the way back to the PaLM paper to get the MLP size details. Each paper tends to say "our arch is just like except for…"
Challenges in Finding LLM Architecture Details Across Papers