Are modern recommendation systems treated as a reinforcement learning problem, with a sum of discounted future rewards, or as strictly single-step transactions? Many products do significant offline data analysis on actions taken to inform changes, but it seems under
Modern Recommendation Systems: Reinforcement Learning vs Single-Step Transactions
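
To make the two framings in the question concrete, here is a minimal sketch that scores the same session both ways. The function names, the reward sequences, and the discount factor of 0.9 are all illustrative assumptions, not taken from any particular production system.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """RL view: sum of gamma**t * r_t over the session, accumulated back to front."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


def single_step_value(rewards: Sequence[float]) -> float:
    """Single-step view: only the immediate reward of the first action counts."""
    return rewards[0] if rewards else 0.0


# Hypothetical per-step rewards (e.g. 1.0 = click) following two candidate
# first recommendations. The numbers are invented for illustration only.
clickbait = [1.0, 0.0, 0.0]  # strong immediate response, then the user leaves
slow_burn = [0.5, 1.0, 1.0]  # weaker immediate response, longer engagement

GAMMA = 0.9  # assumed discount factor

for name, session in [("clickbait", clickbait), ("slow_burn", slow_burn)]:
    print(name, single_step_value(session), discounted_return(session, GAMMA))
```

Under these made-up numbers the single-step score prefers the first candidate while the discounted return prefers the second, which is the practical difference between the two formulations. Setting `gamma` to 0 makes the discounted return collapse to the single-step score.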