Course

Deep Reinforcement Learning

From RL Foundations to PPO, SAC, and the Spinning Up Toolkit

Intermediate, 9h estimated, 4 modules

Reinforcement Learning, Policy Gradients, Deep Learning, PyTorch, TensorFlow

Modules

Agents, environments, policies, value functions, and the mathematical landscape of modern RL algorithms.

Policy Gradient Algorithms

VPG, TRPO, and PPO — the on-policy family of algorithms that directly optimize the policy objective.

Off-Policy Methods & Tooling

DDPG, TD3, and SAC for continuous control, plus the Spinning Up toolkit for running, logging, and benchmarking experiments.

Visual Reinforcement Learning

Train agents directly from pixels using convolutional policies, frame stacking, and ViZDoom — from a stationary-enemy basic scenario to multi-target arena combat.