AI Dynamics

Global AI News Aggregator

DoReMi: Domain-Weighted Resampling for Efficient Model Training

DoReMi trains a small proxy model over data domains to produce domain weights, without any knowledge of downstream tasks. The dataset is then resampled according to those weights, which allows a 280M-parameter proxy model to be used to train an 8B-parameter model (roughly 30x larger) more efficiently.
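The resampling step can be sketched in a few lines of Python. This is a minimal illustration and not the DoReMi implementation: it assumes the domain weights have already been produced by the proxy model, and the function name `resample_by_domain`, the domain names, and the weight values are all hypothetical.

```python
import random


def resample_by_domain(examples_by_domain, domain_weights, num_samples, seed=0):
    """Draw examples so that each domain appears in proportion to its weight.

    examples_by_domain: dict mapping a domain name to a list of examples.
    domain_weights: dict mapping a domain name to a non-negative weight
        (assumed here to come from an already-trained proxy model).
    """
    total = sum(domain_weights.values())
    if total <= 0:
        raise ValueError("domain weights must sum to a positive value")

    domains = list(domain_weights)
    # Normalise so the weights form a probability distribution over domains.
    probs = [domain_weights[d] / total for d in domains]

    rng = random.Random(seed)
    # First pick a domain per sample, then pick an example uniformly within it.
    picked = rng.choices(domains, weights=probs, k=num_samples)
    return [(d, rng.choice(examples_by_domain[d])) for d in picked]


# Illustrative (made-up) domains and weights, not values from DoReMi.
corpus = {
    "web": ["web-0", "web-1", "web-2"],
    "code": ["code-0", "code-1"],
    "books": ["books-0"],
}
weights = {"web": 0.7, "code": 0.3, "books": 0.0}
sample = resample_by_domain(corpus, weights, num_samples=10_000, seed=0)
```

The larger model would then be trained on the resampled mixture. How the proxy model arrives at the weights is outside the scope of this sketch.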

→ View original post on X (@dair_ai)