New Preference Alignment Technique Combines Model Merging and RLHF

We've seen a lot of new preference alignment techniques lately, but this one stands out for combining model merging with RLHF.