The problem is that the process ultimately depends on RL ranking generations made for the same prompt. If you need new human responses too, it can’t be automated.
RL ranking automation needs new human responses