Reinforcement Learning (Q-Learning)

Watch an agent learn to navigate a grid world through trial and error. Rewards and penalties guide the agent toward an optimal policy: the best action to take from each cell.

Speed: 200ms

Learning Rate (alpha): 0.5

Metrics

Episodes: 0

Epsilon: 1.000

The agent learns to navigate to the Goal (green) while avoiding the Pit (red) and Obstacles (black). Arrows show the learned policy: the direction with the highest estimated value in each cell, which approaches the optimal direction as training progresses.
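The learning loop behind a demo like this can be sketched as tabular Q-learning with an epsilon-greedy policy. This is a minimal sketch, not the demo's actual code: the 4x4 layout, the reward values (+10 goal, -10 pit, -0.1 per step), the discount factor `gamma`, the epsilon decay schedule, and all function names are assumptions chosen for illustration. Only the learning rate of 0.5 and the starting epsilon of 1.0 come from the controls shown above.

```python
import random

# Hypothetical 4x4 layout; the demo's real grid is not specified in the text.
ROWS, COLS = 4, 4
START, GOAL, PIT = (0, 0), (3, 3), (1, 3)
OBSTACLES = {(1, 1), (2, 1)}
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
ARROWS = {"up": "^", "down": "v", "left": "<", "right": ">"}


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    # Moving off the grid or into an obstacle leaves the agent in place.
    if not (0 <= r < ROWS and 0 <= c < COLS) or (r, c) in OBSTACLES:
        r, c = state
    if (r, c) == GOAL:
        return (r, c), 10.0, True    # assumed goal reward
    if (r, c) == PIT:
        return (r, c), -10.0, True   # assumed pit penalty
    return (r, c), -0.1, False       # assumed small cost per step


def train(episodes=2000, alpha=0.5, gamma=0.9,
          eps_decay=0.995, eps_min=0.05, seed=0):
    """Run tabular Q-learning and return the Q-table."""
    rng = random.Random(seed)
    q = {(r, c): {a: 0.0 for a in ACTIONS}
         for r in range(ROWS) for c in range(COLS)}
    epsilon = 1.0  # start fully exploratory, as in the Metrics panel
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap episode length
            # Epsilon-greedy: explore with probability epsilon, else exploit.
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))
            else:
                action = max(q[state], key=q[state].get)
            nxt, reward, done = step(state, action)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[nxt].values())
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
        epsilon = max(eps_min, epsilon * eps_decay)
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from START until a terminal cell."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(q[state], key=q[state].get)
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


def render_policy(q):
    """Draw the greedy action per cell: G goal, P pit, # obstacle."""
    lines = []
    for r in range(ROWS):
        row = []
        for c in range(COLS):
            if (r, c) == GOAL:
                row.append("G")
            elif (r, c) == PIT:
                row.append("P")
            elif (r, c) in OBSTACLES:
                row.append("#")
            else:
                row.append(ARROWS[max(q[(r, c)], key=q[(r, c)].get)])
        lines.append(" ".join(row))
    return "\n".join(lines)


if __name__ == "__main__":
    q_table = train()
    print(render_policy(q_table))
    print(greedy_path(q_table))
```

Running the script prints an arrow grid comparable to the one in the demo. Raising `alpha` makes each update overwrite more of the old estimate, and a slower `eps_decay` keeps the agent exploring for longer before it settles on a path.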

Go Deeper

Learn Reinforcement Learning on DataCamp

Curated courses and career tracks to take your understanding from this demo to real-world mastery. All links open directly on DataCamp.

Course

Reinforcement Learning with Gymnasium in Python

Build RL agents using the Gymnasium library. Learn Q-learning, policy gradients, and reward shaping to solve classic control problems.

Advanced, 4 hours

Course

Deep Reinforcement Learning in Python

Combine deep learning and RL with Deep Q-Networks (DQN), PPO, and Actor-Critic algorithms for complex environments.

Advanced, 4 hours

Course

Reinforcement Learning from Human Feedback (RLHF)

Learn how RLHF is used to align LLMs like ChatGPT. Understand reward modeling, proximal policy optimization (PPO), and fine-tuning.

Advanced, 3 hours

Course

Introduction to Deep Learning with PyTorch

Build the neural network foundation needed before implementing Deep Q-Networks and policy gradient methods.

Intermediate, 4 hours

Course

Developing AI Systems with the OpenAI API

Learn how modern RL-trained AI systems are deployed via APIs. Understand prompt engineering and model alignment.

Intermediate, 3 hours

Career/Skill Track

Reinforcement Learning in Python

Master the full reinforcement learning stack—from Q-tables and temporal-difference learning to deep RL and RLHF.

Advanced, 20 hours

Browse all Reinforcement Learning courses

Cirby AI