AI Dynamics

Global AI News Aggregator

Training Materials Using Project Gutenberg Public Domain Corpus

Yes! The bonus materials include training on the Project Gutenberg public domain book corpus. I don't want to go beyond that and curate other datasets, though, because of copyright concerns. However, you could, e.g., use the FineWeb dataset, which is available from Hugging Face.

→ View original post on X (@rasbt)
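
For readers who want to try the FineWeb suggestion, here is a minimal sketch and not part of the original post. It assumes the third-party `datasets` library is installed, that the dataset is published on the Hugging Face Hub as `HuggingFaceFW/fineweb` with a `sample-10BT` subset, and that each record carries a `text` field. The helper names `iter_texts` and `stream_fineweb` and the `min_chars` threshold are hypothetical, chosen only for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator


def iter_texts(records: Iterable[dict], min_chars: int = 200) -> Iterator[str]:
    """Yield the "text" field of each record, skipping very short documents.

    min_chars is an illustrative threshold, not a recommended value.
    """
    for record in records:
        text = record.get("text", "").strip()
        if len(text) >= min_chars:
            yield text


def stream_fineweb(config: str = "sample-10BT"):
    """Open FineWeb in streaming mode so records are fetched lazily.

    Assumption: the dataset ID and subset name below match what is
    currently published on the Hugging Face Hub.
    """
    from datasets import load_dataset  # third-party: pip install datasets

    return load_dataset(
        "HuggingFaceFW/fineweb", name=config, split="train", streaming=True
    )


if __name__ == "__main__":
    # Peek at the first three sufficiently long documents.
    for doc in islice(iter_texts(stream_fineweb()), 3):
        print(doc[:80].replace("\n", " "), "...")
```

Streaming avoids downloading the full corpus before iterating, which is convenient for small experiments. `iter_texts` accepts any iterable of dictionaries, so it can be tried on a handful of hand-written records before touching the real dataset.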
