AI Dynamics

Global AI News Aggregator

Hugging Face Research Pretraining Data Accidentally Leaked Publicly

Oh shit, it seems like all of the HF Research team's pretraining data has been accidentally leaked to the public. The web, PDF, and synthetic datasets are exposed on the HF FineData org… Apparently, an intern used CC to push the data with private=False.

→ View original post on X — @thom_wolf, 2026-03-31 18:47 UTC
