
Reinforcement Learning (RL) Project

Project Description

We employ the Actor-Critic, REINFORCE with Baseline, and episodic n-step SARSA algorithms to learn an optimal policy for three distinct Markov Decision Processes (MDPs) from the OpenAI Gym library: MountainCar-v0, Acrobot-v1, and CartPole-v1. We experimented systematically with both the hyperparameters and the model architecture, and we report results for the most effective configuration.
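All three algorithms share the same episodic structure: roll out an episode under the current policy, then update the learned parameters from the collected transitions. The loop below is a minimal sketch of that structure using a hypothetical stand-in environment with a Gym-style `reset`/`step` interface (the names `ToyEnv` and `run_episode` are illustrative, not taken from the project files):

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym-style environment (reset/step API)."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon  # episode ends after `horizon` steps
        return float(self.t), reward, done, {}


def run_episode(env, policy):
    """Roll out one episode; return the trajectory and the total return."""
    trajectory, total = [], 0.0
    state, done = env.reset(), False
    while not done:
        action = policy(state)
        next_state, reward, done, _ = env.step(action)
        trajectory.append((state, action, reward))
        total += reward
        state = next_state
    return trajectory, total


# Trivial policy that always selects action 1.
traj, ret = run_episode(ToyEnv(), policy=lambda s: 1)
```

With the real Gym environments, `ToyEnv()` would be replaced by `gym.make("CartPole-v1")` (note that newer Gym/Gymnasium releases change the `reset`/`step` return signatures, so the exact unpacking depends on the installed version).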

Technical Skills

Python, OpenAI Gym, PyTorch, Matplotlib, Jupyter Notebook

Dependencies

  • OpenAI Gym: `!pip install gym`
  • PyTorch (check CPU/GPU compatibility): https://pytorch.org/get-started/locally/
  • NumPy: `!pip install numpy`
  • Matplotlib: `!pip install matplotlib`

File Contents

  • Actor Critic Final.py: implements the Actor-Critic algorithm, which combines policy approximation (the Actor) with value-function approximation (the Critic) to improve learning efficiency.
  • REINFORCE with Baseline Final.py: implements the REINFORCE policy-gradient algorithm with a baseline, which reduces the variance of the gradient estimates.
  • Semi-Gradient-SARSA Final.py: implements the semi-gradient SARSA algorithm, a temporal-difference method that updates action-value (Q) estimates to optimize the policy.
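As a concrete illustration of the last of these, here is a minimal sketch of a single semi-gradient SARSA(0) update with a linear action-value function, w ← w + α · (r + γ·Q(s′, a′) − Q(s, a)) · x(s, a). All names are illustrative and the project files use richer function approximators; this just shows the shape of the update:

```python
def q_value(w, features):
    """Linear action-value estimate: Q(s, a) = w . x(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, features))


def sarsa_update(w, x_sa, reward, x_next_sa, alpha=0.1, gamma=0.99, done=False):
    """One semi-gradient SARSA(0) step on the weight vector w."""
    # TD target bootstraps from Q(s', a') unless the transition is terminal.
    target = reward + (0.0 if done else gamma * q_value(w, x_next_sa))
    td_error = target - q_value(w, x_sa)
    # Semi-gradient: the gradient of a linear Q w.r.t. w is just x(s, a).
    return [wi + alpha * td_error * xi for wi, xi in zip(w, x_sa)]


# Toy example: 2-dimensional features, terminal transition with reward 1.
w = [0.0, 0.0]
w = sarsa_update(w, x_sa=[1.0, 0.0], reward=1.0, x_next_sa=[0.0, 0.0], done=True)
# td_error = 1.0 - 0.0 = 1.0, so w becomes [0.1, 0.0]
```

The n-step variant used in the project replaces the one-step target with a discounted sum of n rewards plus a bootstrapped Q-value n steps ahead; the weight update itself is unchanged.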