AI Dynamics

Global AI News Aggregator

Parallel Dataset Embedding Computing with HuggingFace and DDP

The most useful bit of code I've written all year: calling map() on a HuggingFace dataset in torch distributed mode (such as DDP). As one example, this lets you compute embeddings for a dataset in parallel, using all the GPUs you have: http://gist.github.com/jxmorris12/69a730fee174f5309968e984c298f8f2
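The linked gist is not reproduced here. As a rough sketch of the general idea, and not the author's actual code, the example below assumes one common pattern: each rank takes a strided slice of the rows, runs map() on its slice only, and the per-rank results are gathered and interleaved back into the original row order. The names `shard_indices`, `merge_shards`, `distributed_embed`, and the `embed_batch` callback are hypothetical, introduced only for illustration. It also assumes `torch.distributed` has already been initialized and that the embeddings are small enough to gather as Python objects.

```python
from typing import Any, Callable, List, Sequence


def shard_indices(n: int, rank: int, world_size: int) -> List[int]:
    """Strided row indices owned by `rank`: rank, rank + world_size, ..."""
    return list(range(rank, n, world_size))


def merge_shards(shards: Sequence[Sequence[Any]]) -> List[Any]:
    """Undo the strided split by interleaving per-rank results into row order."""
    world_size = len(shards)
    merged: List[Any] = [None] * sum(len(s) for s in shards)
    for rank, shard in enumerate(shards):
        merged[rank::world_size] = list(shard)
    return merged


def distributed_embed(
    dataset,  # assumed to be a datasets.Dataset
    embed_batch: Callable[[List[str]], List[Any]],  # hypothetical: texts -> embeddings
    column: str = "text",
    batch_size: int = 32,
) -> List[Any]:
    """Embed `dataset[column]` across all ranks; every rank gets all rows back."""
    # Imported lazily so the two helpers above can be used without torch installed.
    import torch.distributed as dist

    rank, world_size = dist.get_rank(), dist.get_world_size()

    # Each rank keeps only its own rows, then maps over that slice.
    local = dataset.select(shard_indices(len(dataset), rank, world_size))
    local = local.map(
        lambda batch: {"embedding": embed_batch(batch[column])},
        batched=True,
        batch_size=batch_size,
    )

    # Collect every rank's embeddings on every rank, then restore row order.
    gathered: List[Any] = [None] * world_size
    dist.all_gather_object(gathered, list(local["embedding"]))
    return merge_shards(gathered)
```

A script built on this sketch would typically call `torch.distributed.init_process_group` first (and, with the NCCL backend, `torch.cuda.set_device` for the local rank) and be launched with `torchrun --nproc_per_node=<number of GPUs>`. The strided split keeps shard sizes within one row of each other, so no rank sits idle for long.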

→ View original post on X (@jxmnop)
