What if you didn’t have any hard goals? And got rewards continually? And had stochastic actions? MDPs as utility-based problem-solving agents.
4/3. (FO)MDPs: The plan. General model has no initial state; complex cost and reward functions; and finite/infinite/indefinite horizons. Standard algorithms.
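
To make the model concrete, here is a minimal sketch of value iteration, one commonly taught algorithm for fully observable MDPs. The slide does not say which standard algorithms it means, so this choice is an assumption. The two-state MDP, its state and action names, the rewards, and the discount factor are all made up for illustration, and the sketch assumes an infinite horizon with discounting. Because the model has no initial state, the result is a value and an action for every state.

```python
# Hypothetical two-state MDP, invented for illustration only.
# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    "s0": {
        "stay": [(1.0, "s0", 0.0)],
        "go": [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],  # stochastic action
    },
    "s1": {
        "stay": [(1.0, "s1", 2.0)],
        "go": [(1.0, "s0", 0.0)],
    },
}


def q_value(values, outcomes, gamma):
    """Expected reward plus discounted value over one action's outcomes."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Repeat Bellman backups until the largest change falls below tol.

    gamma (discount factor) and tol are assumed values, not from the slides.
    """
    values = {s: 0.0 for s in transitions}
    for _ in range(max_iters):
        new_values = {
            s: max(q_value(values, outs, gamma) for outs in acts.values())
            for s, acts in transitions.items()
        }
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    # Greedy policy: in each state, pick the action with the best Q-value.
    policy = {
        s: max(acts, key=lambda a: q_value(values, acts[a], gamma))
        for s, acts in transitions.items()
    }
    return values, policy


values, policy = value_iteration(transitions)
print(policy)
print(values)
```

There is no goal test anywhere in this sketch: the agent simply maximizes expected discounted reward, which is the sense in which an MDP solver acts as a utility-based agent.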