Quest 2 • Lesson 1

🧠 Reinforcement Learning Basics

Discover how agents learn by trial and error as they interact with an environment.

Reinforcement Learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment. The agent receives rewards or penalties for its actions and learns to maximise the total reward over time.

"RL is like training a dog with treats – you reward good behaviour and discourage bad behaviour. Over time, the dog learns what actions lead to treats."
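To make "total reward over time" concrete, here is a minimal sketch that adds up the rewards from one made-up episode. The reward list and the function names `total_reward` and `discounted_return` are illustrative assumptions, not part of the lesson's demo. The discount factor `gamma` is the same idea as the `gamma` in the Q-learning rule later in this lesson: it makes rewards that arrive later count a little less.

```python
def total_reward(rewards):
    """Plain sum of every reward collected in one episode."""
    return sum(rewards)


def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma ** t."""
    g = 0.0
    # Work backwards so each earlier step folds in the discounted future.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Hypothetical episode: five ordinary steps, then a final rewarding step.
rewards = [-1, -1, -1, -1, -1, 10]

print("total reward:", total_reward(rewards))
print("discounted return (gamma=0.9):", round(discounted_return(rewards, 0.9), 4))
```

With `gamma = 1.0` the two functions agree; with a smaller `gamma` the final reward is worth less the longer the agent takes to reach it, which is one reason an agent might prefer shorter paths.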

🎯 Key RL Concepts

Agent – the learner/decision maker.
Environment – the world the agent interacts with.
State – the current situation of the agent in the environment.
Action – a move the agent can make (for example up, down, left, or right).
Reward – a numeric feedback signal (positive or negative) the agent receives after an action.
Episode – one complete sequence of states, actions, and rewards, from the start until the task ends.
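The concepts above fit together in a simple loop: the agent sees a state, picks an action, and the environment answers with a reward and a new state until the episode ends. The sketch below illustrates that loop with a tiny one-dimensional corridor. The `Corridor` class, `run_episode`, `random_policy`, and the reward values are all hypothetical names and numbers chosen for illustration.

```python
import random


class Corridor:
    """Hypothetical environment: cells 0..4 in a row, goal at cell 4."""

    def reset(self):
        self.state = 0          # the agent always starts in cell 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the ends of the corridor clamp the position
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 10 if done else -1
        return self.state, reward, done


def random_policy(state):
    """An agent that ignores the state and moves at random."""
    return random.choice([-1, 1])


def run_episode(env, policy, max_steps=100):
    """Run one episode and return a list of (state, action, reward) tuples."""
    state = env.reset()
    history = []
    for _ in range(max_steps):
        action = policy(state)                      # agent decides
        next_state, reward, done = env.step(action)  # environment responds
        history.append((state, action, reward))
        state = next_state
        if done:
            break
    return history


random.seed(0)
episode = run_episode(Corridor(), random_policy)
print("steps:", len(episode), "| total reward:", sum(r for _, _, r in episode))
```

Swapping `random_policy` for a function that always returns `+1` reaches the goal in four steps, which shows how much the choice of policy matters.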

🚀 Interactive Grid Demo

Control the agent (yellow) to reach the goal (green) while avoiding obstacles (red). Every step gives a reward of -1, and reaching the goal gives +10.


📘 How the demo works

The grid is a 4×4 environment. You (the agent) start in the top‑left cell (blue). The goal is the green cell. Red cells are obstacles: landing on one gives a reward of -5. Moving into a wall is not allowed. Try to reach the goal in as few steps as possible.
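The rules described above could be written as a small environment class. This is only a sketch under stated assumptions, not the demo's actual source: the goal and obstacle positions are made up, a move into a wall is assumed to leave the agent in place while still costing a step, and the -5 and +10 rewards are assumed to replace the usual -1 rather than add to it. `GridWorld` and `MOVES` are hypothetical names.

```python
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class GridWorld:
    """4x4 grid with a start cell, a goal cell, and penalty cells."""

    def __init__(self, size=4, start=(0, 0), goal=(3, 3), obstacles=((1, 1), (2, 2))):
        self.size = size
        self.start = start
        self.goal = goal                  # assumed position
        self.obstacles = set(obstacles)   # assumed positions
        self.reset()

    def reset(self):
        self.pos = self.start
        self.steps = 0
        self.total_reward = 0
        return self.pos

    def step(self, action):
        """Apply one move and return (new_position, reward, done)."""
        dr, dc = MOVES[action]
        row, col = self.pos[0] + dr, self.pos[1] + dc
        # Assumption: a move into a wall keeps the agent where it is.
        if 0 <= row < self.size and 0 <= col < self.size:
            self.pos = (row, col)
        if self.pos == self.goal:
            reward = 10
        elif self.pos in self.obstacles:
            reward = -5
        else:
            reward = -1
        self.steps += 1
        self.total_reward += reward
        return self.pos, reward, self.pos == self.goal


env = GridWorld()
for move in ["right", "right", "right", "down", "down", "down"]:
    position, reward, done = env.step(move)
print(f"Steps: {env.steps} | Total Reward: {env.total_reward} | reached goal: {done}")
```

Keeping `steps` and `total_reward` on the environment mirrors the counters a demo like this would display, and makes it easy to compare two routes by their totals.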

q_learning.py

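import random

# Q-learning update rule
# Q[state] is a list with one value per action; lr is the learning rate
# and gamma is the discount factor for future rewards.
Q[state][action] += lr * (reward + gamma * max(Q[next_state]) - Q[state][action])

# Agent selects an action (exploration vs. exploitation)
# actions is a list of action indices, e.g. [0, 1, 2, 3]
if random.random() < epsilon:
    action = random.choice(actions)  # explore: try a random action
else:
    action = max(actions, key=lambda a: Q[state][a])  # exploit: best-known action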
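To see the update rule and the explore/exploit choice working together, here is a minimal, self-contained training loop on a 4×4 grid. It is a sketch under stated assumptions rather than the demo's own code: the goal and obstacle positions, the hyperparameter values, and the helper names `step`, `choose_action`, `train`, and `greedy_path` are all hypothetical, and a blocked move is assumed to leave the agent in place at a cost of -1.

```python
import random

SIZE, START, GOAL = 4, (0, 0), (3, 3)          # assumed layout
OBSTACLES = {(1, 1), (2, 2)}                   # assumed positions
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right


def step(state, action):
    """Return (next_state, reward, done) for one move."""
    row, col = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    if not (0 <= row < SIZE and 0 <= col < SIZE):
        return state, -1, False                # wall: stay put, still pay the step cost
    nxt = (row, col)
    if nxt == GOAL:
        return nxt, 10, True
    return nxt, (-5 if nxt in OBSTACLES else -1), False


def choose_action(Q, state, epsilon):
    """Epsilon-greedy: random action with probability epsilon, else the best-known one."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[state][a])


def train(episodes=3000, lr=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    random.seed(seed)
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        state = START
        for _ in range(max_steps):
            action = choose_action(Q, state, epsilon)
            next_state, reward, done = step(state, action)
            # No future reward is available once the episode has ended.
            target = reward if done else reward + gamma * max(Q[next_state])
            Q[state][action] += lr * (target - Q[state][action])
            state = next_state
            if done:
                break
    return Q


def greedy_path(Q, max_steps=50):
    """Follow the best-known action from START; return (visited cells, total reward)."""
    state, path, total = START, [START], 0
    for _ in range(max_steps):
        state, reward, done = step(state, choose_action(Q, state, epsilon=0.0))
        path.append(state)
        total += reward
        if done:
            break
    return path, total


Q = train()
path, total = greedy_path(Q)
print("greedy path:", path)
print("steps:", len(path) - 1, "| total reward:", total)
```

The table `Q` starts at zero everywhere, so early episodes are mostly wandering. As the goal's +10 propagates backwards through the update rule, the greedy path settles on a short route that avoids the penalty cells.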

✨ Challenge: Find the Shortest Path

Reset the grid and try to reach the goal in the fewest steps. What is the minimum number of steps you can achieve? (Hint: obstacles block the direct path.)
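If you want to check an answer for a layout of your own, a breadth-first search finds the fewest steps between two cells while treating obstacles as blocked. The sketch below is one possible way to do it; `shortest_path_length` is a hypothetical helper, and the example layout is made up rather than taken from the demo, so it does not give away the challenge.

```python
from collections import deque


def shortest_path_length(size, start, goal, blocked):
    """Fewest up/down/left/right moves from start to goal, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (row, col), dist = queue.popleft()
        if (row, col) == goal:
            return dist
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (row + dr, col + dc)
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in blocked and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Made-up 3x3 example: the straight route along the top row is blocked.
print(shortest_path_length(3, (0, 0), (0, 2), {(0, 1), (1, 1)}))
```

Breadth-first search explores cells in order of distance from the start, so the first time it reaches the goal it has found a shortest route.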

➡️ Next Lesson

Lesson 2.2: Generative AI (GANs) – create new images with AI.
