How research works:
– John Schulman did his PhD on reinforcement learning for robotics.
– Then he went to OpenAI and applied it to language models: ChatGPT was fine-tuned from a GPT-3.5-series model.
– Then other researchers found that RL isn't needed, because you can optimize chatbots directly to please their users.
How Research Evolves: From RL Robotics to Direct Chatbot Optimization