Category "stable-baselines"

RL + optimization: how to do it better?

I am learning about how to optimize using reinforcement learning. I have chosen the problem of maximum matching in a bipartite graph as I can easily compute the

LSTM-based policy in a Stable-Baselines3 model

I am trying to make a PPO model using the stable-baselines3 library. I want to use a policy network with an LSTM layer in it. However, I can't find such a possi

Why does ep_rew_mean decrease over time?

To learn about reinforcement learning for optimization, I have written some code to try to find the maximum cardinality matching in a graph. Not only d