JS2498/CS420-Reinforcement-Learning

CS420: Reinforcement Learning

This repository contains our solutions to the assignments for the course "CS420/414: Reinforcement Learning", offered by Dr. Prabuchandran K J at IIT Dharwad.

  • Assignment 2 : Bandit Algorithms

    • Implemented epsilon-greedy, variable epsilon-greedy, Softmax, Upper Confidence Bound (UCB) and Thompson sampling algorithms for the Bernoulli and Normal reward settings.
  • Assignment 3 : Value Based Methods

    • Solved a classical maze problem using policy iteration and value iteration.
  • Assignment 4 : Sample Based Monte-Carlo and Temporal Difference Methods

    • Implemented Every-Visit Monte-Carlo, Q-learning and SARSA agents for the classical maze and Mountain Car environments.
  • Assignment 5 : Temporal Difference Methods with Function Approximation and the REINFORCE Algorithm

    • Implemented Q-learning and SARSA with tile coding and radial basis function (RBF) approximation, and REINFORCE with and without a baseline, for the Cart Pole and Mountain Car environments.
  • Mini Project : Policy Gradient Algorithms for Atari games

    • Trained Ray RLlib A2C, A3C and PPO agents on the Pong, Breakout and Space Invaders Atari environments and compared their results. The report also includes an explanation of each algorithm.
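
For readers new to the material, the epsilon-greedy bandit from Assignment 2 can be sketched roughly as follows. This is a minimal illustration and not the code in this repository: the function name `epsilon_greedy_bernoulli`, the arm probabilities, and the default values for `steps`, `epsilon` and `seed` are all assumptions chosen for the example.

```python
import random


def epsilon_greedy_bernoulli(probs, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (illustrative sketch only).

    probs   -- hypothetical success probability of each arm
    steps   -- number of pulls
    epsilon -- probability of picking a uniformly random arm
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: choose any arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: choose the arm with the highest estimated mean
            # (ties go to the lowest index).
            arm = max(range(n_arms), key=lambda a: values[a])

        # Bernoulli reward: 1 with probability probs[arm], else 0.
        reward = 1 if rng.random() < probs[arm] else 0

        # Incremental update of the sample mean for the pulled arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total_reward += reward

    return counts, values, total_reward


if __name__ == "__main__":
    counts, values, total = epsilon_greedy_bernoulli([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(v, 3) for v in values])
    print("total reward:", total)
```

The incremental mean update avoids storing the reward history. A decaying `epsilon` schedule (the "variable epsilon-greedy" variant listed above) would replace the constant with a function of the step index.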