Training a LLaMA 3 8B variant with a 16K-token context window, twice the model's current 8K limit. If training succeeds, the model will be open-sourced.
LLaMA 3 8B Extended to 16K Token Context Window