From what I gather, LaMDA does not use RLHF per se, in the capital-letters sense that OpenAI has tersely described, but some other form of reinforcement learning with human feedback of its own that has not been disclosed.
LaMDA's reinforcement learning approach differs from OpenAI's RLHF