Thanks! I always half expect someone to pop up and say “we just cloned this other repo and ran it for a 50% performance improvement”. With RL it is very hard to know what “good performance” is.
Evaluating RL Performance: Challenges in Benchmarking Improvements