So cool to see @BigCodeProject release SantaCoder, afaik the very first LLM trained on an "opted-out" dataset, aka allowing people to opt-out from the training dataset. 1.1B parameters that outperforms larger models on both generation and infilling! https://huggingface.co/bigcode/santacoder…
SantaCoder: First LLM with Opted-Out Training Dataset