In deep RL methods, what happens if we set the learning rate to 1?

If we set the learning rate to 1 in RL models:

  1. The updating process will be very slow
  2. The agent will always get a low reward
  3. The new Q-value will always be the same
  4. The agent won't consider previous experience when calculating the Q-value for a given state-action pair


Solution 1:[1]

Answer: 4. If the learning rate is 1, the updated estimate of the Q-value for a given state-action pair is simply the newly calculated Q-value. It does not take into account the Q-values that were calculated for that state-action pair at previous time steps.
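
To make this concrete, here is a minimal sketch, assuming the standard tabular Q-learning update Q_new = (1 - alpha) * Q_old + alpha * (reward + gamma * max Q_next). The function name `q_update`, the discount factor of 0.9, and the sample numbers are illustrative assumptions, not part of the original question.

```python
def q_update(q_old, reward, q_next_max, alpha, gamma=0.9):
    """One tabular Q-learning update for a single state-action pair.

    Assumed rule: Q_new = (1 - alpha) * Q_old + alpha * (reward + gamma * max Q_next).
    """
    # The newly calculated value for this state-action pair.
    td_target = reward + gamma * q_next_max
    # alpha decides how much of the old estimate survives the update.
    return (1 - alpha) * q_old + alpha * td_target


# Hypothetical values chosen only for illustration.
q_old = 5.0
reward = 1.0
q_next_max = 2.0

# alpha = 0.5: the old estimate and the new value are averaged.
blended = q_update(q_old, reward, q_next_max, alpha=0.5)

# alpha = 1.0: the (1 - alpha) term is zero, so the old estimate is discarded.
overwritten = q_update(q_old, reward, q_next_max, alpha=1.0)

print(blended, overwritten)
```

With alpha set to 1, the result equals the newly calculated value no matter what `q_old` was, which is what option 4 describes.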

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 anonymous