Language Models Continue Sequences from Prompts, Not Maximize Rewards
Global AI News Aggregator
Language models don't maximize rewards; they are given a prompt (a kind of inception) and continue the sequence.
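To make "continue the sequence" concrete, here is a minimal sketch using a toy bigram model. It is not how any real language model is built: the corpus, the function names `train_bigram` and `continue_sequence`, and the choice of greedy (most frequent follower) decoding are all assumptions made for illustration. Real models learn a far richer next-token distribution and usually sample from it. The point is only that the loop contains no reward term: the prompt sets the starting context, and each step appends a likely next token.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    follows = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follows[prev][nxt] += 1
    return follows


def continue_sequence(follows, prompt, max_new_tokens=5):
    """Extend a non-empty prompt one token at a time.

    Greedy decoding is an illustrative simplification: each step appends
    the most frequent follower of the last token. No reward is computed.
    """
    out = list(prompt)
    for _ in range(max_new_tokens):
        candidates = follows.get(out[-1])
        if not candidates:
            break  # nothing ever followed this token in the training text
        out.append(candidates.most_common(1)[0][0])
    return out


# Hypothetical toy corpus, chosen only for this example.
corpus = "the cat sat on the mat and the cat sat on the rug".split()
model = train_bigram(corpus)

# The prompt conditions the continuation; the model simply keeps going.
continuation = continue_sequence(model, ["the"], max_new_tokens=3)
print(" ".join(continuation))
```

Changing the prompt changes the continuation, which is the sense in which the prompt acts as the "inception" for everything that follows.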