It doesn't need to be representing new physics for us to have trouble understanding what the giant inscrutable tensors mean, or writing airtight, Goodhart-proof objectives that operate over them, a task that requires far more proficiency than current interpretability efforts do.
Understanding Giant Tensors and Writing Goodhart-Proof Objectives