In Deep RL methods, what happens if we set the learning rate to 1?
If we set the learning rate to 1 in RL models:
- The updating process will be very slow
- The agent will always get a low reward
- The new Q-value will always be the same
- The agent won't consider previous experience when calculating the Q-value for a given state-action pair
Solution 1:[1]
Answer: 4, the last option. If the learning rate is 1, the updated estimate of the Q-value for a given state-action pair is simply the newly calculated Q-value; the Q-values calculated for that pair at previous time steps are not taken into account.
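To make this concrete, here is a minimal sketch that assumes the standard tabular Q-learning update, Q_new = (1 - alpha) * Q_old + alpha * (reward + gamma * max_next_q). The function name `q_update` and all numbers are made up for illustration and do not come from the question. In deep RL the learning rate is usually the optimizer's step size, so this is only an analogy for that case.

```python
def q_update(q_old, reward, gamma, max_next_q, alpha):
    """One Q-learning update for a single state-action pair (illustrative helper)."""
    # Newly calculated value: immediate reward plus discounted best next value.
    target = reward + gamma * max_next_q
    # Blend the old estimate with the new one; alpha is the learning rate.
    return (1 - alpha) * q_old + alpha * target


# With alpha = 1 the old estimate gets weight (1 - 1) = 0, so two very
# different old Q-values produce the same updated Q-value.
a = q_update(q_old=5.0, reward=1.0, gamma=0.9, max_next_q=2.0, alpha=1.0)
b = q_update(q_old=-100.0, reward=1.0, gamma=0.9, max_next_q=2.0, alpha=1.0)
print(a == b)  # the previous estimate had no influence

# With alpha = 0.5 the old estimate still contributes half of the result.
c = q_update(q_old=5.0, reward=1.0, gamma=0.9, max_next_q=2.0, alpha=0.5)
print(c)
```

With a learning rate below 1, the old estimate keeps some weight, which is what lets the agent average over noisy rewards instead of overwriting its estimate on every step.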
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | anonymous |
