Q-learning vs temporal-difference vs model-based reinforcement learning
I'm taking a university course called "Intelligent Machines". We were introduced to three reinforcement learning methods and given an intuition for when to use each one. I quote:
- Q-Learning - Best when MDP can't be solved.
- Temporal Difference Learning - best when MDP is known or can be learned but can't be solved.
- Model-based - best when MDP can't be learned.
Are there any good examples explaining when to choose one method over the other?
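To make the comparison concrete, here is a minimal sketch of the distinction on a toy problem. Everything in it is an assumption made for illustration and is not from the course: the three-state chain, the `step` function, and the hyperparameters are all hypothetical. `value_iteration` plans with the transition function as a known model, while `q_learning` only sees sampled transitions.

```python
import random

GAMMA = 0.9
N_STATES, ACTIONS, TERMINAL = 3, (0, 1), 2  # action 0 = left, 1 = right


def step(state, action):
    """Hypothetical toy dynamics: a deterministic chain 0 -> 1 -> 2,
    with reward 1.0 for entering the terminal state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    return nxt, (1.0 if nxt == TERMINAL else 0.0)


def value_iteration(sweeps=50):
    """Model-based planning: queries step() for every state-action pair,
    which is only possible when the dynamics are known (or learned)."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(TERMINAL):
            outcomes = [step(s, a) for a in ACTIONS]
            v[s] = max(r + GAMMA * v[n] for n, r in outcomes)
    return v


def q_learning(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Model-free learning: uses step() only as an environment to sample from.
    The update below is itself a temporal-difference update."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            # Epsilon-greedy exploration.
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[s][x])
            n, r = step(s, a)
            # Move Q(s, a) toward the bootstrapped target r + gamma * max Q(n, .).
            q[s][a] += alpha * (r + GAMMA * max(q[n]) - q[s][a])
            s = n
    return q
```

On this toy chain, `max(q_learning()[s])` should approach `value_iteration()[s]` for each non-terminal state `s`. The two functions differ in what they need: the first needs the model itself, the second needs only experience.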
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
