I want to share some results related to my self-merge experiments:
– The 47B merge doesn't work: replicating blocks > layers (apologies for doubting it)
– The 52B is successful in BBH and MATH Lvl 5. However, there might be an issue with the original model's MATH eval. This is
Self-Merge Experiments: 52B Model Success in BBH and MATH
