This program implements CNN and Q-learning strategies for predicting the best left/right move in the Gym CartPole-v1 environment. The goal is to reach 200 frames before the pole falls down.
For the CNN, I first generated a random dataset, then built a neural network with 6 Dense layers and 4 Dropout layers, for a total of 76,552 parameters.
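
As a rough illustration of where a parameter total like this comes from, the sketch below counts the trainable parameters of a stack of fully connected layers. Each Dense layer contributes `inputs * units` weights plus `units` biases, and Dropout layers contribute none. The layer widths are hypothetical and are not the ones used in the notebook, so the printed total differs from 76,552. The input size of 4 and output size of 2 match CartPole's observation vector and its two actions.

```python
def dense_param_count(input_dim, layer_widths):
    """Count trainable parameters in a stack of fully connected (Dense) layers."""
    total = 0
    fan_in = input_dim
    for width in layer_widths:
        # Each Dense layer holds a (fan_in x width) weight matrix plus one bias per unit.
        total += fan_in * width + width
        fan_in = width
    return total


# Hypothetical widths for six Dense layers (NOT the notebook's actual sizes).
# Dropout layers are omitted because they have no trainable parameters.
HYPOTHETICAL_WIDTHS = [64, 128, 128, 64, 32, 2]

print(dense_param_count(4, HYPOTHETICAL_WIDTHS))  # prints 35554
```

In Keras, `model.summary()` reports the same kind of total for a built model.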
Both the CNN and Q-learning strategies reached the goal of 200 frames before the pole falls down.
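
For readers unfamiliar with Q-learning, here is a minimal tabular sketch of the update rule and an epsilon-greedy action choice. It is an illustration under stated assumptions, not the notebook's code: the description above does not say whether the Q-learning strategy is tabular or network-based, and the names `discretize`, `choose_action`, `q_update`, the bin width, and the values of `ALPHA` and `GAMMA` are all hypothetical.

```python
import random
from collections import defaultdict

ALPHA = 0.1       # learning rate (assumed value)
GAMMA = 0.99      # discount factor (assumed value)
ACTIONS = (0, 1)  # CartPole-v1: 0 pushes the cart left, 1 pushes it right


def discretize(observation, bin_width=0.25):
    """Map a continuous observation to a hashable table key (hypothetical binning)."""
    return tuple(int(value // bin_width) for value in observation)


def choose_action(q_table, state, epsilon):
    """Epsilon-greedy: explore with probability epsilon, otherwise act greedily."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_update(q_table, state, action, reward, next_state, done):
    """One Q-learning step: move Q(s, a) toward reward + GAMMA * max_a' Q(s', a')."""
    # A terminal state has no future value.
    best_next = 0.0 if done else max(q_table[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])


# Unseen (state, action) pairs default to a Q-value of 0.0.
q_table = defaultdict(float)
```

In a training loop, each environment step would call `choose_action` on the discretized observation, apply the action, and then pass the resulting reward and next observation to `q_update`.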
The Jupyter notebook was built with Anaconda, Python 3.8, and TensorFlow 2.4.