~~The hungry dragon wants to find his way to the chicken~~

Use the dropdown menu to select a grid size, then press *Start new environment*.

Adjust agent speed using the corresponding bar.

Epsilon is the probability that the agent takes a random action.
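For readers who want to see how epsilon is typically used, here is a minimal Python sketch of epsilon-greedy action selection. It is not taken from this project's code: the function name `epsilon_greedy` and the list-of-Q-values representation are assumptions made for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from the Q-values of a single state.

    q_values: list of Q-values, one per action (hypothetical layout).
    epsilon:  probability of taking a random action, between 0 and 1.
    """
    if rng.random() < epsilon:
        # Explore: choose uniformly among all actions.
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the highest Q-value
    # (the first one wins on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


# With epsilon = 0 the agent always takes the greedy action.
print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0))
```

With epsilon set to 1 every action is random, and with epsilon set to 0 the agent never explores.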

Q-value shows the average Q-value of the current state.
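As a rough illustration, the displayed value could be computed as below. This sketch assumes the average is taken over the Q-values of all actions available in the state, which the description above does not spell out. The `q_table` dictionary and the `average_q` helper are hypothetical names, not part of this project.

```python
def average_q(q_table, state):
    """Return the mean Q-value over all actions of one state.

    q_table: dict mapping a state to a list of per-action Q-values
             (an assumed layout, used here only for illustration).
    """
    q_values = q_table[state]
    return sum(q_values) / len(q_values)


# One grid cell with four actions (for example up, down, left, right).
q_table = {(0, 0): [0.5, -0.5, 1.0, 0.0]}
print(average_q(q_table, (0, 0)))  # prints 0.25
```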

Episode reward is the total reward received during the episode.

The size of each sphere represents the average Q-value of the state.

Red spheres represent negative Q-values, while green ones represent positive ones.
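The sphere rendering can be pictured with a small Python sketch. It assumes the radius grows linearly with the magnitude of the average Q-value and is capped at a maximum. The actual scaling used by the visualization is not documented here, and `sphere_style`, `max_abs_q` and `max_radius` are hypothetical names.

```python
def sphere_style(avg_q, max_abs_q, max_radius=0.5):
    """Map a state's average Q-value to a (radius, color) pair.

    avg_q:      average Q-value of the state.
    max_abs_q:  largest absolute average Q-value on the grid, used to
                normalize the radius (assumed scaling).
    max_radius: radius given to the state with the largest magnitude.
    """
    if max_abs_q <= 0:
        # Nothing learned yet: draw no sphere (radius 0).
        return 0.0, "green"
    # Radius is proportional to the magnitude, capped at max_radius.
    radius = max_radius * min(abs(avg_q) / max_abs_q, 1.0)
    # Negative values are red, positive ones green. A value of exactly
    # zero has radius 0, so its color is never visible.
    color = "red" if avg_q < 0 else "green"
    return radius, color


print(sphere_style(-2.0, 4.0))  # prints (0.25, 'red')
print(sphere_style(4.0, 4.0))   # prints (0.5, 'green')
```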

Q-Learning project