We are seeing a lot of models pop up in the 3B and 7B parameter range, including RedPajama-INCITE 3B from Together, which is trained on the RedPajama dataset. That dataset is designed to replicate the LLaMA training dataset. I've heard Meta is also exploring 3B as an option for its fully open-source model.
Small Language Models: 3B and 7B Parameters Emerge in Open Source