Reinforcement Learning (RL) is a type of machine learning in which an agent learns to make decisions by taking actions in an environment and receiving rewards, with the goal of maximizing the cumulative reward over time. This page provides an overview of the fundamentals of reinforcement learning.

Key Concepts

  • Agent: The learner and decision-maker that acts in the environment.
  • Environment: Everything the agent interacts with; it responds to the agent's actions.
  • State: A description of the environment's current situation, which the agent uses to choose its next action.
  • Action: A choice the agent makes that can change the state of the environment.
  • Reward: A scalar feedback signal returned after each action; the agent's objective is to maximize the total reward it accumulates (see the sketch after this list).
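
These concepts fit together in a simple interaction loop. Below is a minimal, hypothetical sketch in Python: the CorridorEnv environment, its five-cell corridor, and its reward scheme are invented purely for illustration, and the agent acts at random just to show the interface.

    import random

    # A hypothetical environment: the agent walks a 1-D corridor of 5 cells
    # and earns a reward of +1 for reaching the rightmost cell.
    class CorridorEnv:
        def __init__(self, length=5):
            self.length = length
            self.state = 0                  # state: the agent's current cell

        def reset(self):
            self.state = 0
            return self.state

        def step(self, action):
            # action: -1 (move left) or +1 (move right)
            self.state = max(0, min(self.length - 1, self.state + action))
            done = self.state == self.length - 1
            reward = 1.0 if done else 0.0   # reward: feedback for the action
            return self.state, reward, done

    # The agent: here it acts randomly, purely to show the interface.
    env = CorridorEnv()
    state = env.reset()
    done = False
    while not done:
        action = random.choice((-1, +1))         # the agent chooses an action
        state, reward, done = env.step(action)   # the environment responds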

Types of RL

  • Tabular RL: The state-action space is discrete and small enough that value estimates can be stored in a lookup table.
  • Continuous RL: States and/or actions are continuous, so values or policies must be represented with function approximation.
  • Model-based RL: The agent learns or is given a model of the environment's dynamics and uses it to plan.
  • Model-free RL: The agent learns values or a policy directly from experience, without building a model of the environment.

Common Algorithms

  • Q-Learning: An iterative, off-policy method that learns a Q-function, which maps state-action pairs to expected cumulative (discounted) reward (see the sketch after this list).
  • SARSA: An on-policy method whose name comes from the quintuple it updates on (state, action, reward, next state, next action); unlike Q-learning, its update target uses the action the agent actually takes next.
  • Deep Q-Network (DQN): A combination of Q-learning and deep learning in which a neural network approximates the Q-function, enabling learning in large or high-dimensional state spaces.
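
As a concrete illustration of the Q-learning update, here is a minimal tabular sketch on the same hypothetical 5-cell corridor used above; the hyperparameters (ALPHA, GAMMA, EPSILON) are illustrative, not tuned.

    import random

    N_STATES, ACTIONS = 5, (-1, +1)
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration

    # The Q-table maps each (state, action) pair to an estimated return.
    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

    def env_step(state, action):
        next_state = max(0, min(N_STATES - 1, state + action))
        done = next_state == N_STATES - 1
        return next_state, (1.0 if done else 0.0), done

    for episode in range(500):
        state, done = 0, False
        while not done:
            # Epsilon-greedy selection: mostly exploit, occasionally explore.
            if random.random() < EPSILON:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[(state, a)])
            next_state, reward, done = env_step(state, action)
            # Off-policy Q-learning update:
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in ACTIONS)
            Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
            state = next_state

SARSA differs only in the update target: it uses the Q-value of the action the agent actually takes in the next state rather than the maximum over actions, which is what makes it on-policy.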

Learning Process

  1. The agent observes the current state and selects an action.
  2. The environment transitions to a new state and returns a reward.
  3. The agent updates its value estimates (e.g., its Q-values) based on the reward and the new state.
  4. Steps 1-3 repeat until the agent reaches a terminal state; over many such episodes the agent's estimates improve (see the sketch after this list).
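
The sketch below instantiates these steps with SARSA on the same hypothetical corridor used earlier; the step numbers in the comments refer to the list above, and all parameter values are illustrative.

    import random

    N_STATES, ACTIONS = 5, (-1, +1)
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

    def choose(state):
        # Step 1: select an action (epsilon-greedy).
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q[(state, a)])

    def env_step(state, action):
        # Step 2: the environment transitions and returns a reward.
        next_state = max(0, min(N_STATES - 1, state + action))
        done = next_state == N_STATES - 1
        return next_state, (1.0 if done else 0.0), done

    for episode in range(500):
        state, done = 0, False
        action = choose(state)
        while not done:
            next_state, reward, done = env_step(state, action)
            next_action = choose(next_state)
            # Step 3: update the Q-value; SARSA's target uses the action
            # actually taken next (on-policy), not the max over actions.
            target = reward + GAMMA * (0.0 if done else Q[(next_state, next_action)])
            Q[(state, action)] += ALPHA * (target - Q[(state, action)])
            state, action = next_state, next_action   # Step 4: repeat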

Resources

For more information on reinforcement learning, see our Reinforcement Learning Tutorial.

[Figure: Reinforcement Learning Diagram]