Datasets that IBM used to train its granite.13b LLM: arXiv, Common Crawl, DeepMind Mathematics, Free Law, GitHub Clean, Hacker News, OpenWeb Text, Project Gutenberg, Pubmed Central, SEC filings, Stack Exchange, USPTO, Webhose, Wikimedia. https://ibm.com/downloads/cas/X9W4O6BM
…
IBM Granite 13B LLM Training Datasets Revealed
