Credit assignment is hard. I wonder how many LLM papers actually use multi-step RL; tool use is the main example that comes to mind. If you work on this, it would be great to hear from you. How many people out there are doing multi-step RL with LLMs?
Credit Assignment and Multi-Step RL in LLM Research