RL Model Achieves Superhuman Snark on Human Cope Chains
Applying large-scale RL to chains of cope written by human seethers, we observe an “lmao moment” at which the model spontaneously exhibits superhuman snark.