– Conditioning on actions is what makes it a world model.
– Planning action sequences is what makes it useful.
– Predicting in representation space, with representations trained in a completely self-supervised, task-independent manner, is what makes it complicated.
– Training the encoder and predictor
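
The points above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the linear encoder and predictor, their random (untrained) weights, the dimensions, and the random-shooting planner are hypothetical stand-ins, not the method described in the source. It shows only the shape of the idea: encode an observation into a representation, predict the next representation conditioned on an action, and plan by searching over action sequences in representation space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
OBS_DIM, LATENT_DIM, ACTION_DIM = 8, 4, 2

# Untrained random weights standing in for a learned encoder and predictor.
W_enc = rng.normal(size=(LATENT_DIM, OBS_DIM)) / np.sqrt(OBS_DIM)
A = 0.9 * np.eye(LATENT_DIM)                    # latent dynamics
B = rng.normal(size=(LATENT_DIM, ACTION_DIM))   # effect of the action


def encode(obs):
    """Map an observation to its representation."""
    return W_enc @ obs


def predict(z, action):
    """Action-conditioned prediction of the next representation."""
    return A @ z + B @ action


def rollout(z0, actions):
    """Apply the predictor along a sequence of actions; return the final representation."""
    z = z0
    for a in actions:
        z = predict(z, a)
    return z


def plan(z0, z_goal, horizon=5, num_candidates=256):
    """Random-shooting planner: sample action sequences, keep the one whose
    predicted final representation is closest to the goal representation."""
    candidates = rng.uniform(-1.0, 1.0, size=(num_candidates, horizon, ACTION_DIM))
    costs = np.array([np.linalg.norm(rollout(z0, seq) - z_goal) for seq in candidates])
    best = int(np.argmin(costs))
    return candidates[best], float(costs[best])


obs_now = rng.normal(size=OBS_DIM)
obs_goal = rng.normal(size=OBS_DIM)
z_now, z_goal = encode(obs_now), encode(obs_goal)

best_actions, best_cost = plan(z_now, z_goal)
print(best_actions.shape, best_cost)
```

Note that the planning cost is measured between representations, never between raw observations. Training the encoder and predictor is deliberately left out of this sketch.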
World Models: Action Conditioning and Representation Learning