Adding a little bit of SIGReg to the prefinal activations did coerce them into independent Gaussians, but it hurt the generalization of the value functions. Training a full LeWM ahead of time also resulted in worse value function estimation. I’m not giving up yet, but my first few attempts
SIGReg and LeWM Experiments: Challenges with Value Function Generalization