Exploring Monte Carlo Tree Search Techniques

May 20, 2025

AI 101: Monte Carlo Tree Search (MCTS)

Introduction

  • Speaker: Tommy Thompson
  • Topic: Overview of Monte Carlo Tree Search (MCTS) algorithm
  • Applications: Video game AI, expert computer players

Overview of MCTS

  • MCTS is a heuristic-driven search algorithm.
  • Combines classic tree search with reinforcement learning principles.

Tree Search Algorithms

  • Connect potential states of a problem via actions.
  • Classic algorithms:
    • Uninformed algorithms (e.g., breadth-first, depth-first search): Expand states in a strict, fixed order (a minimal BFS sketch follows this list).
    • Intelligent algorithms (e.g., A*): Use action costs and heuristics to estimate how valuable a state is.
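
As a minimal illustration of the strict ordering used by uninformed search, here is a breadth-first search sketch over a toy state graph; the graph, state names, and goal are hypothetical and exist only for this example.

    from collections import deque

    # Hypothetical toy state graph: each state maps to the states reachable by one action.
    GRAPH = {
        "start": ["a", "b"],
        "a": ["c"],
        "b": ["c", "goal"],
        "c": ["goal"],
        "goal": [],
    }

    def breadth_first_search(start, goal):
        """Expand states strictly in the order they were discovered (FIFO queue)."""
        frontier = deque([[start]])
        visited = {start}
        while frontier:
            path = frontier.popleft()          # always take the oldest unexpanded path
            state = path[-1]
            if state == goal:
                return path
            for nxt in GRAPH[state]:
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append(path + [nxt])
        return None

    print(breadth_first_search("start", "goal"))  # ['start', 'b', 'goal']

The queue enforces the strict ordering: states are expanded purely by discovery order, with no judgement about which looks more promising.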

Reinforcement Learning

  • Aims to learn optimal strategies through repeated action assessment.
  • Exploration vs. Exploitation Trade-off:
    • Exploit known best actions but continue exploring alternatives.
    • Essential for discovering potentially better strategies over time (a small sketch of the trade-off follows this list).
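
To make the exploration-versus-exploitation trade-off concrete, here is a small epsilon-greedy sketch; this is a standard illustration of the idea rather than anything specific from the video, and the three actions and their payout probabilities are invented for the example.

    import random

    # Hypothetical payout probabilities for three actions; the agent does not know these.
    TRUE_WIN_RATES = [0.3, 0.5, 0.7]
    EPSILON = 0.1                     # fraction of decisions spent exploring

    counts = [0, 0, 0]                # how many times each action has been tried
    values = [0.0, 0.0, 0.0]          # running average reward per action

    for step in range(10_000):
        if random.random() < EPSILON:
            action = random.randrange(3)        # explore: try any action at random
        else:
            action = values.index(max(values))  # exploit: pick the current best estimate
        reward = 1.0 if random.random() < TRUE_WIN_RATES[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]  # incremental mean

    print(values)  # estimates drift toward 0.3, 0.5, 0.7; most pulls go to the best action

Pure exploitation would lock onto whichever action happened to pay off first; the small exploration rate keeps refining the other estimates so a better action can still be discovered.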

MCTS Mechanics

  • Merges tree search and reinforcement learning.
  • Uses a tree model to find the best path, evaluating only the most promising subsections of the tree rather than the whole thing.
  • Updates state values based on outcomes of actions.

Monte Carlo Method

  • A sampling method that estimates answers by randomly sampling the problem space; more samples give better estimates (a simple sketch follows this list).
  • Serves a similar role to the Minimax algorithm, but copes far better with large problem spaces and high branching factors because it samples outcomes instead of enumerating them exhaustively.
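
As a simple, game-agnostic illustration of the Monte Carlo idea, the sketch below approximates pi by random sampling; the estimate improves as the number of samples grows, just as MCTS value estimates improve with more playouts.

    import random

    def estimate_pi(num_samples: int) -> float:
        """Approximate pi by sampling points in the unit square and counting
        how many fall inside the quarter circle of radius 1."""
        inside = sum(
            1 for _ in range(num_samples)
            if random.random() ** 2 + random.random() ** 2 <= 1.0
        )
        return 4.0 * inside / num_samples

    print(estimate_pi(1_000))      # rough estimate
    print(estimate_pi(1_000_000))  # noticeably closer to 3.14159...

Each sample is cheap and random, yet the aggregate converges on the right answer; MCTS applies the same principle to estimating the value of game states.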

Advantages of MCTS

  • Searches only a few layers deep at a time, prioritizing the most promising parts of the tree for further exploration.
  • Simulates outcomes rather than exhaustively searching.
  • Adaptable to high branching factors, isolating relevant tree sections.

Implementation of MCTS

  • Relies on a forward model to predict game outcomes based on actions.
  • Uses four key steps (a minimal code sketch follows this list):
    1. Selection: Navigate from the root to a promising node in the tree.
    2. Expansion: Add a new child state to the tree, unless a terminal state has been reached.
    3. Simulation: Perform a random playout from that state until a terminal state is reached.
    4. Backpropagation: Update the values of all nodes on the path back to the root.
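
Putting the four steps together, here is a minimal MCTS sketch in Python. The tiny token-taking game stands in as a hypothetical forward model, and the class names, constants, and structure are illustrative assumptions rather than the implementation discussed in the video.

    import math
    import random

    # Hypothetical forward model: a tiny "take 1-3 tokens, taking the last one wins" game.
    class NimState:
        def __init__(self, tokens=10, player=1):
            self.tokens = tokens
            self.player = player                  # player to move: +1 or -1

        def legal_actions(self):
            return [n for n in (1, 2, 3) if n <= self.tokens]

        def apply(self, action):
            return NimState(self.tokens - action, -self.player)

        def is_terminal(self):
            return self.tokens == 0

        def winner(self):
            return -self.player                   # whoever took the last token

    class Node:
        def __init__(self, state, parent=None, action=None):
            self.state, self.parent, self.action = state, parent, action
            self.children = []
            self.untried = state.legal_actions()  # actions not yet expanded
            self.visits = 0
            self.wins = 0.0                       # wins for the player who moved into this node

    def uct_select(node, c=1.4):
        # Selection policy: pick the child with the highest UCT score.
        return max(node.children,
                   key=lambda ch: ch.wins / ch.visits
                                  + c * math.sqrt(math.log(node.visits) / ch.visits))

    def mcts(root_state, iterations=2000):
        root = Node(root_state)
        for _ in range(iterations):
            node = root
            # 1. Selection: descend through fully expanded nodes using UCT.
            while not node.untried and node.children:
                node = uct_select(node)
            # 2. Expansion: add one new child state, unless this node is terminal.
            if node.untried:
                action = node.untried.pop()
                node.children.append(Node(node.state.apply(action), parent=node, action=action))
                node = node.children[-1]
            # 3. Simulation: random playout until a terminal state is reached.
            state = node.state
            while not state.is_terminal():
                state = state.apply(random.choice(state.legal_actions()))
            winner = state.winner()
            # 4. Backpropagation: update every node on the path back to the root.
            while node is not None:
                node.visits += 1
                if winner == -node.state.player:  # the player who moved into this node won
                    node.wins += 1
                node = node.parent
        return max(root.children, key=lambda ch: ch.visits).action

    print(mcts(NimState(tokens=10)))  # usually prints 2: it leaves the opponent a multiple of 4

The selection step here uses the UCT score described under "Performance of MCTS" below; swapping the toy forward model for a real game's rules is the main change a fuller version would need.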

Performance of MCTS

  • Requires hundreds to thousands of iterations to produce accurate value estimates.
  • Balances exploration of different parts of the tree against exploitation of known good moves, using selection policies such as UCT (Upper Confidence bounds applied to Trees); the formula is given below.
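
The standard exploration/exploitation balance in UCT comes from the UCB1 score: each child i of a node is rated by its observed win rate plus an exploration bonus, and the child with the highest score is chosen during Selection.

    UCT(i) = w_i / n_i  +  c * sqrt( ln(N) / n_i )

Here w_i is the number of wins recorded through child i, n_i is that child's visit count, N is the parent's visit count, and c is a tunable exploration constant (sqrt(2), roughly 1.41, is a common default). The first term favours children that have performed well so far; the second grows for children visited rarely relative to their parent, pulling the search back toward under-explored moves.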

Benefits of MCTS

  • Considered an anytime algorithm: it can return its current best answer whenever it is stopped, however much time or computation it has been given.
  • Notable implementation: AlphaGo by Google DeepMind.

Applications

  • Used in various projects, including:
    • Fable Legends (canceled)
    • Total War franchise
    • Research projects: Ms. Pac-Man, Magic: The Gathering, general video game AI competitions.

Conclusion

  • MCTS is increasingly popular in AI for games and general intelligence.
  • Supporting resources: Patreon contributors help fund video production.