This is tricky: we’ll need ways for the human who’s supervising the model to use any relevant knowledge or skills that the model already has, even though they can’t trust the model to be reliably helpful.
Leveraging Model Capabilities While Maintaining Supervisory Control