Reinforcement learning (Q-learning) evaluation

I am new to reinforcement learning and am currently working on a small Q-learning project, but I am a little confused about two things.

1. What is the testing phase of a Q-learning model, and how do we make a prediction with it (that is, try it on a single, unseen input)? So far I have written the functions for choosing an action, getting the reward, and so on, and I was able to run 10,000 episodes, but I believe that is the training phase.

2. What metrics do we use to say whether the model has learned and performs well? I am looking for something like accuracy in a classification setting.

Thank you.
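
To make the two questions concrete, here is a minimal sketch of how training and testing are commonly separated in tabular Q-learning. Everything in it is an assumption made for illustration and not taken from the project described above: the 5-state chain environment, the `step`, `train`, and `evaluate` helpers, and the hyperparameter values. During testing the agent acts greedily (no exploration) and the Q-table is not updated; "prediction" for a single state is just the action with the highest Q-value. Average return and success rate over test episodes play a role loosely similar to accuracy.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, goal at state 4
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
MAX_STEPS = 50        # cap on episode length

def step(state, action):
    """Toy environment transition: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def greedy_action(q, state):
    """'Prediction' for one state: the action with the highest Q-value."""
    best = max(q[state])
    return random.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Training phase: epsilon-greedy exploration plus Q-table updates."""
    random.seed(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(MAX_STEPS):
            if random.random() < epsilon:
                action = random.choice(ACTIONS)
            else:
                action = greedy_action(q, state)
            next_state, reward, done = step(state, action)
            # Standard Q-learning target; no bootstrapping from terminal states.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

def evaluate(q, episodes=100):
    """Testing phase: greedy policy only, no exploration, no updates."""
    returns, successes = [], 0
    for _ in range(episodes):
        state, total = 0, 0.0
        for _ in range(MAX_STEPS):
            state, reward, done = step(state, greedy_action(q, state))
            total += reward
            if done:
                successes += 1
                break
        returns.append(total)
    return sum(returns) / episodes, successes / episodes

q_table = train()
avg_return, success_rate = evaluate(q_table)
print(f"average return: {avg_return:.2f}, success rate: {success_rate:.2f}")
```

Other quantities that are often tracked alongside these are the average number of steps per episode and the training reward curve over episodes. Which one matters most depends on the task.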



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
