Reinforcement Learning Algorithms Implementation: From Zero To Hero

What is Reinforcement Learning about?

In contrast to supervised learning, where machines learn from examples that include the correct decision, and unsupervised learning, where machines discover patterns in the data, reinforcement learning allows machines to learn from partial, implicit, and delayed feedback. This is particularly useful in sequential decision-making tasks where a machine repeatedly interacts with an environment or with users. Applications of reinforcement learning include robotic control, autonomous vehicles, game playing, conversational agents, assistive technologies, computational finance, and operations research.

Disclaimer!

This repository mainly contains my assignments for this Reinforcement Learning course, which was offered in Fall 2021 at UWaterloo by Professor Pascal Poupart. For reasons of academic integrity, I do not have permission to post this repository publicly online; it is therefore accessible only upon explicit request to me, as defined in this document.

Download From GitHub With Explanations [PRIVATE REPO, ONLY ACCESSIBLE BY EXPLICIT REQUEST]

Part 1

Summary:

Markov Decision Process [from scratch in Python]
- value iteration
- policy iteration
- modified policy iteration
Maze problem to test the above algorithms
Compare the performance of each algorithm
Q-Learning [from scratch in Python]
Use matplotlib to compare the effect of the Q-Learning parameters on the cumulative discounted rewards per episode
Deep Q-Network (DQN) to solve the CartPole problem from OpenAI Gym
- Using the Agents library from TensorFlow
Use matplotlib to compare the effect of the Deep Q-Network parameters on the average cumulative discounted rewards [also averaged across several runs to reduce stochasticity]
More details: see the Assignment 1 section of https://cs.uwaterloo.ca/~ppoupart/teaching/cs885-fall21/assignments.html
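To make the first item in the Part 1 summary concrete, here is a minimal sketch of tabular value iteration in plain Python. It is not the code from the private repository: the function names (`q_values`, `value_iteration`), the nested-list layout `T[a][s][s2]` / `R[a][s]`, and the two-state toy MDP are all assumptions chosen for illustration.

```python
def q_values(T, R, V, gamma):
    """One Bellman backup: Q[s][a] = R[a][s] + gamma * sum_s2 T[a][s][s2] * V[s2]."""
    n_actions, n_states = len(T), len(V)
    return [
        [R[a][s] + gamma * sum(T[a][s][s2] * V[s2] for s2 in range(n_states))
         for a in range(n_actions)]
        for s in range(n_states)
    ]


def value_iteration(T, R, gamma=0.95, tol=1e-10, max_iters=10_000):
    """Repeat V <- max_a Q(s, a) until the largest change falls below tol."""
    n_states = len(T[0])
    V = [0.0] * n_states
    for _ in range(max_iters):
        Q = q_values(T, R, V, gamma)
        new_V = [max(q_s) for q_s in Q]
        delta = max(abs(new - old) for new, old in zip(new_V, V))
        V = new_V
        if delta < tol:
            break
    # Extract the greedy policy with respect to the final value estimate.
    Q = q_values(T, R, V, gamma)
    policy = [max(range(len(T)), key=Q[s].__getitem__) for s in range(n_states)]
    return V, policy


# Hypothetical toy MDP: state 1 is absorbing and pays reward 1 on every step.
T = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay in the current state
    [[0.0, 1.0], [0.0, 1.0]],  # action 1: move to (or remain in) state 1
]
R = [
    [0.0, 1.0],  # action 0: reward in state 0, state 1
    [0.0, 1.0],  # action 1: reward in state 0, state 1
]
V, policy = value_iteration(T, R, gamma=0.5)
```

With a discount factor of 0.5, the value of state 1 converges to 1 / (1 - 0.5) = 2 and the value of state 0 to 0.5 * 2 = 1, so the greedy action in state 0 is to move. Policy iteration and modified policy iteration differ in how the policy is evaluated between improvement steps, but could reuse the same backup.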

Part 2

Summary:

Bandit algorithms from scratch in Python
- epsilon-greedy
- Thompson sampling
- UCB
REINFORCE algorithm from scratch in Python
Model-based RL algorithm from scratch in Python
Soft Q-Learning in PyTorch
Soft Actor Critic in PyTorch
Discussion of the properties of each algorithm and their effect on performance
More details: see the Assignment 2 section of https://cs.uwaterloo.ca/~ppoupart/teaching/cs885-fall21/assignments.html
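The three bandit strategies listed in the Part 2 summary differ only in how the next arm is chosen. The sketch below illustrates that on a Bernoulli bandit using only the standard library. It is an assumption-laden illustration rather than the assignment code: `run_bandit`, its parameters, the UCB1 bonus `sqrt(2 ln t / n)`, and the uniform Beta(1, 1) prior for Thompson sampling are all choices made here.

```python
import math
import random


def run_bandit(probs, horizon, strategy, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit for `horizon` steps; return pull counts per arm."""
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k      # number of pulls of each arm
    successes = [0] * k   # number of reward-1 outcomes of each arm

    def mean(a):
        return successes[a] / counts[a] if counts[a] else 0.0

    for t in range(1, horizon + 1):
        if strategy == "epsilon_greedy":
            # Explore uniformly with probability epsilon, otherwise exploit.
            if rng.random() < epsilon:
                arm = rng.randrange(k)
            else:
                arm = max(range(k), key=mean)
        elif strategy == "ucb":
            # Pull every arm once, then maximise mean plus exploration bonus.
            untried = [a for a in range(k) if counts[a] == 0]
            if untried:
                arm = untried[0]
            else:
                arm = max(
                    range(k),
                    key=lambda a: mean(a) + math.sqrt(2 * math.log(t) / counts[a]),
                )
        elif strategy == "thompson":
            # Sample a plausible success rate per arm from its Beta posterior.
            samples = [
                rng.betavariate(1 + successes[a], 1 + counts[a] - successes[a])
                for a in range(k)
            ]
            arm = max(range(k), key=samples.__getitem__)
        else:
            raise ValueError(f"unknown strategy: {strategy}")

        reward = 1 if rng.random() < probs[arm] else 0
        counts[arm] += 1
        successes[arm] += reward
    return counts


for name in ("epsilon_greedy", "ucb", "thompson"):
    print(name, run_bandit([0.1, 0.9], horizon=2000, strategy=name))
```

On an easy problem like this one, all three strategies should concentrate most pulls on the better arm; the interesting comparison is how quickly each one stops pulling the worse arm.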

Part 3

Partially Observable RL
- Deep Recurrent Q-Learning (DRQN) algorithm in PyTorch
  - Using LSTM and MLP
  - Compare with the performance of a Deep Q-Network (DQN)
Generative Adversarial Imitation Learning (GAIL) algorithm in PyTorch
- Using the deterministic policy gradient update technique
- Compare with the performance of Behavior Cloning (BC)
Categorical (C51) distributional RL algorithm
- Compare with DQN on the CartPole domain with epsilon-greedy exploration
More details: see the Assignment 3 section of https://cs.uwaterloo.ca/~ppoupart/teaching/cs885-fall21/assignments.html
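The step that most distinguishes the categorical (C51) algorithm from DQN is projecting the shifted return distribution back onto a fixed set of atoms. A plain-Python sketch of that projection for a single transition follows. The function name `project_distribution` and its argument layout are hypothetical, and a real implementation would normally be batched in PyTorch rather than written as a loop.

```python
import math


def project_distribution(next_probs, reward, done, gamma, v_min, v_max):
    """Project the distribution of reward + gamma * Z' onto the fixed support.

    next_probs[j] is the probability of atom z_j = v_min + j * delta_z under
    the next-state return distribution. Returns target probabilities on the
    same atoms.
    """
    n_atoms = len(next_probs)
    delta_z = (v_max - v_min) / (n_atoms - 1)
    target = [0.0] * n_atoms
    for j, p in enumerate(next_probs):
        z_j = v_min + j * delta_z
        # Bellman update of the atom, clipped to the support range.
        tz = reward if done else reward + gamma * z_j
        tz = min(max(tz, v_min), v_max)
        # Fractional index of tz on the support, guarded against rounding.
        b = min(max((tz - v_min) / delta_z, 0.0), n_atoms - 1)
        lower, upper = math.floor(b), math.ceil(b)
        if lower == upper:
            # tz falls exactly on an atom: it receives all of the mass.
            target[lower] += p
        else:
            # Split the mass between neighbours in proportion to proximity.
            target[lower] += p * (upper - b)
            target[upper] += p * (b - lower)
    return target


# Hypothetical example: 11 atoms on [0, 10] and a uniform next-state distribution.
uniform = [1.0 / 11] * 11
target = project_distribution(uniform, reward=1.0, done=False,
                              gamma=0.5, v_min=0.0, v_max=10.0)
```

The projected probabilities still sum to 1, and when no atom is clipped the projection preserves the mean of the shifted distribution. Training then minimises the cross-entropy between this target and the predicted distribution for the action taken.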
