Box space always returning float values in OpenAI gym

I'm trying to get started with reinforcement learning using OpenAI Gym. I tried to tackle the Hotter-Colder exercise (https://gym.openai.com/envs/HotterColder-v0/).

For the action space, I am passing a Box space so that the space is continuous. Even though I specify the dtype as int32, when I train the model via model.learn the actions always arrive as float32 values between 0 and 2.5.

As you can see in the code below, the action_space is declared as a Box with dtype int32, but during both training and prediction the action value is always a float32 array. Furthermore, rather than ranging between 1.0 and 100.0, the values seem to be stuck between 0.0 and 2.5. Does anyone know how to solve this?

Thank you very much.

Here is my code:

import numpy as np
from gym import Env
from gym.spaces import Box, Discrete

class HotterColder(Env):
    def __init__(self):
        self.range = 100
        self.guess_max = 100
        
        self.number = 0
        self.guess_count = 0
        self.action_space = Box(low=1,high=100,shape=(1,),dtype=np.int32)
        self.observation_space = Discrete(4)
        self.state = 0
        np.random.seed(0)
        
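    def reset(self):
        # Note: np.random.randint excludes `high`, so the target lies in [1, self.range - 1].
        self.number = np.random.randint(low=1, high=self.range)
        self.guess_count = 0
        self.state = 0  # was `self.observation`, which left a stale state from the previous episode
        return self.state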
    
    def render(self):
        pass

    def step(self, action):
        # int() truncates toward zero, so a float action of 2.5 becomes a guess of 2.
        guess = int(action[0])
        
        if guess < self.number:
            self.state = 1
        elif guess > self.number:
            self.state = 3
        else:
            self.state = 2
        
        self.guess_count += 1        
        done = self.guess_count >= self.guess_max
        reward = ((min(guess, self.number) + self.range) / (max(guess, self.number) + self.range))**2        
        info = {"guess": guess, "actual": self.number, "guesses": self.guess_count, "reward": reward, "state": self.state}
                
        if done:
            if guess == self.number:                
                print("Correct guess." + str(info))
                
        return self.state, reward, done, info
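
One possible workaround, sketched under assumptions the question does not confirm: if the training library keeps emitting float actions regardless of the dtype declared on the Box, the environment can do the conversion itself inside step. The helper names `to_guess` and `scaled_to_guess` are hypothetical and not part of Gym. The second helper additionally assumes the action space has been redeclared with bounds of -1 and 1, which some continuous-control learners handle better than a wide range such as 1 to 100.

```python
def to_guess(action, low=1, high=100):
    """Clip a raw float action to [low, high] and round it to the nearest integer."""
    raw = float(action[0])  # works for a list or a one-element NumPy array
    clipped = min(max(raw, low), high)
    return int(round(clipped))


def scaled_to_guess(action, low=1, high=100):
    """Map an action assumed to lie in [-1, 1] linearly onto the integers in [low, high]."""
    raw = min(max(float(action[0]), -1.0), 1.0)  # guard against out-of-range values
    value = low + (raw + 1.0) * 0.5 * (high - low)
    return int(round(value))


# Hypothetical use inside step(), replacing `guess = int(action[0])`:
#     guess = to_guess(action)
print(to_guess([2.4]), to_guess([250.0]), scaled_to_guess([-1.0]), scaled_to_guess([1.0]))
```

Neither helper changes what the learner outputs. Both only make the environment tolerant of float actions, so whether the guesses then spread over the full range still depends on how the model explores.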

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
