Modern Reinforcement Learning: Actor-Critic Algorithms

 

What you'll learn

  • How to code policy gradient methods in PyTorch
  • How to code Deep Deterministic Policy Gradients (DDPG) in PyTorch
  • How to code Twin Delayed Deep Deterministic Policy Gradients (TD3) in PyTorch
  • How to code actor-critic algorithms in PyTorch
  • How to implement cutting-edge artificial intelligence research papers in Python

About The Course Modern Reinforcement Learning: Actor-Critic Algorithms

In this advanced course on deep reinforcement learning, you will learn how to implement policy gradient, actor-critic, deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3), and soft actor-critic (SAC) algorithms in a variety of challenging environments from the OpenAI Gym. There will be a strong focus on environments with continuous action spaces, which are of particular interest to those looking to do research into robotic control with deep reinforcement learning.

Rather than spoon-feeding the student, this course teaches you to read deep reinforcement learning research papers on your own and implement them from scratch. You will learn a repeatable framework for quickly implementing the algorithms in advanced research papers. Mastering the content in this course will be a major step forward in your capabilities as an artificial intelligence engineer, and will set you apart from students who rely on others to break down complex ideas for them.

Fear not: if it's been a while since your last reinforcement learning course, we will begin with a briskly paced review of core topics.

The course begins with a practical review of the fundamentals of reinforcement learning, including topics such as:

  • The Bellman Equation
  • Markov Decision Processes
  • Monte Carlo Prediction
  • Monte Carlo Control
  • Temporal Difference Prediction with TD(0)
  • Temporal Difference Control with Q-Learning

It then moves straight into coding our first agent: a blackjack-playing artificial intelligence. From there, we progress to teaching an agent to balance the cart pole using Q-learning.
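To give a flavor of the temporal difference control topic in the review above, here is a minimal sketch of a single tabular Q-learning update. It is an illustration only, not the course's own code: the function name `q_learning_update`, the dictionary-based table, and the values of `alpha` and `gamma` are all assumptions, and it uses plain Python rather than PyTorch.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, done,
                      n_actions, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    q maps (state, action) pairs to value estimates. alpha (learning rate)
    and gamma (discount factor) are illustrative defaults, not tuned values.
    """
    # Bootstrap from the greedy value of the next state; a terminal
    # transition contributes no future value.
    if done:
        best_next = 0.0
    else:
        best_next = max(q[(next_state, a)] for a in range(n_actions))

    td_target = reward + gamma * best_next
    td_error = td_target - q[(state, action)]

    # Move the current estimate a fraction alpha toward the TD target.
    q[(state, action)] += alpha * td_error
    return td_error


# Hypothetical two-action problem with an all-zero initial table.
q = defaultdict(float)
td_error = q_learning_update(q, state=0, action=1, reward=1.0,
                             next_state=1, done=False, n_actions=2)
```

Starting from an all-zero table, this single step moves the estimate for state 0, action 1 a fraction `alpha` of the way toward the observed reward.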

Once you have mastered the fundamentals, the pace quickens and we move straight into an introduction to policy gradient methods. We cover the REINFORCE algorithm and use it to teach an artificial intelligence to land on the moon in the lunar lander environment from the OpenAI Gym. Next, we progress to coding the one-step actor-critic algorithm to beat the lunar lander again.
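As a rough illustration of the REINFORCE idea mentioned above, the sketch below computes discounted returns for one episode and forms the usual Monte Carlo policy gradient loss from them. The helper names `discounted_returns` and `reinforce_loss` are hypothetical, and plain Python floats stand in for the PyTorch tensors a real agent would use; treat it as a sketch under those assumptions rather than the course's implementation.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of an episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one computed after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Negative sum of log-probabilities weighted by returns.

    Minimizing this quantity with gradient descent corresponds to ascending
    the REINFORCE estimate of the policy gradient.
    """
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three-step episode with a reward of 1.0 at every step.
episode_returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
# episode_returns is [1.75, 1.5, 1.0]
```

In a PyTorch version, `log_probs` would typically come from the policy network's action distribution so that gradients can flow back through the loss.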

With the fundamentals out of the way, we move on to our harder projects: implementing deep reinforcement learning research papers. We will start with Deep Deterministic Policy Gradients (DDPG), an algorithm for teaching robots to excel at a variety of continuous control tasks. DDPG combines many of the advances of deep Q-learning with traditional actor-critic methods to achieve state-of-the-art results in environments with continuous action spaces.
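One of the ideas DDPG carries over from deep Q-learning is the use of target networks, which DDPG updates softly by blending a small fraction of the online weights into the target weights at each step. The sketch below shows that blending on plain lists of floats. The function name `soft_update`, the list representation, and the value of `tau` are assumptions made for illustration; a PyTorch implementation would apply the same rule to each parameter tensor.

```python
def soft_update(target_params, online_params, tau=0.001):
    """Blend online weights into target weights.

    Computes target <- tau * online + (1 - tau) * target for every weight
    and returns the new target weights as a list. tau is an illustrative
    default; smaller values make the target network change more slowly.
    """
    if len(target_params) != len(online_params):
        raise ValueError("parameter lists must have the same length")
    return [
        tau * w_online + (1.0 - tau) * w_target
        for w_target, w_online in zip(target_params, online_params)
    ]


# Hypothetical weights: the target starts at zero, the online network at one.
new_target = soft_update([0.0, 0.0], [1.0, 1.0], tau=0.1)
```

With `tau=1.0` the rule reduces to copying the online weights outright, which is how target networks are commonly initialized.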
