Exploring Monte Carlo Tree Search Techniques

May 20, 2025

AI 101: Monte Carlo Tree Search (MCTS)

Introduction

  • Speaker: Tommy Thompson
  • Topic: Overview of Monte Carlo Tree Search (MCTS) algorithm
  • Applications: Video game AI, expert computer players

Overview of MCTS

  • MCTS is a heuristic-driven search algorithm.
  • Combines classic tree search with reinforcement learning principles.

Tree Search Algorithms

  • Connect potential states of a problem via actions.
  • Classic algorithms:
    • Uninformed algorithms (e.g., breadth-first, depth-first search): Expand states in a strict, fixed order (a minimal BFS sketch follows this list).
    • Intelligent algorithms (e.g., A*): Use action costs and heuristics to estimate how valuable a state is.
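
As a minimal illustration of the strict ordering used by uninformed search, here is a breadth-first search sketch over a toy state graph; the graph, state names, and goal are hypothetical and exist only for this example.

    from collections import deque

    # Hypothetical toy state graph: each state maps to the states reachable by one action.
    GRAPH = {
        "start": ["a", "b"],
        "a": ["c"],
        "b": ["c", "goal"],
        "c": ["goal"],
        "goal": [],
    }

    def breadth_first_search(start, goal):
        """Expand states strictly in the order they were discovered (FIFO queue)."""
        frontier = deque([[start]])
        visited = {start}
        while frontier:
            path = frontier.popleft()          # always take the oldest unexpanded path
            state = path[-1]
            if state == goal:
                return path
            for nxt in GRAPH[state]:
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append(path + [nxt])
        return None

    print(breadth_first_search("start", "goal"))  # ['start', 'b', 'goal']

The queue enforces the strict ordering: states are expanded purely by discovery order, with no judgement about which looks more promising.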

Reinforcement Learning

  • Aims to learn optimal strategies through repeated action assessment.
  • Exploration vs. Exploitation Trade-off:
    • Exploit known best actions but continue exploring alternatives.
    • Essential for discovering potentially better strategies over time (a small sketch of the trade-off follows this list).
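
To make the exploration-versus-exploitation trade-off concrete, here is a small epsilon-greedy sketch; this is a standard illustration of the idea rather than anything specific from the video, and the three actions and their payout probabilities are invented for the example.

    import random

    # Hypothetical payout probabilities for three actions; the agent does not know these.
    TRUE_WIN_RATES = [0.3, 0.5, 0.7]
    EPSILON = 0.1                     # fraction of decisions spent exploring

    counts = [0, 0, 0]                # how many times each action has been tried
    values = [0.0, 0.0, 0.0]          # running average reward per action

    for step in range(10_000):
        if random.random() < EPSILON:
            action = random.randrange(3)        # explore: try any action at random
        else:
            action = values.index(max(values))  # exploit: pick the current best estimate
        reward = 1.0 if random.random() < TRUE_WIN_RATES[action] else 0.0
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]  # incremental mean

    print(values)  # estimates drift toward 0.3, 0.5, 0.7; most pulls go to the best action

Pure exploitation would lock onto whichever action happened to pay off first; the small exploration rate keeps refining the other estimates so a better action can still be discovered.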

MCTS Mechanics

  • Merges tree search and reinforcement learning.
  • Uses a tree model to find the best path, evaluating only the most promising subsections of the tree rather than the whole thing.
  • Updates state values based on outcomes of actions.

Monte Carlo Method

  • A sampling method that estimates answers by randomly sampling the problem space; more samples give better estimates (a simple sketch follows this list).
  • Serves a similar role to the Minimax algorithm, but copes far better with large problem spaces and high branching factors because it samples outcomes instead of enumerating them exhaustively.
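
As a simple, game-agnostic illustration of the Monte Carlo idea, the sketch below approximates pi by random sampling; the estimate improves as the number of samples grows, just as MCTS value estimates improve with more playouts.

    import random

    def estimate_pi(num_samples: int) -> float:
        """Approximate pi by sampling points in the unit square and counting
        how many fall inside the quarter circle of radius 1."""
        inside = sum(
            1 for _ in range(num_samples)
            if random.random() ** 2 + random.random() ** 2 <= 1.0
        )
        return 4.0 * inside / num_samples

    print(estimate_pi(1_000))      # rough estimate
    print(estimate_pi(1_000_000))  # noticeably closer to 3.14159...

Each sample is cheap and random, yet the aggregate converges on the right answer; MCTS applies the same principle to estimating the value of game states.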

Advantages of MCTS

  • Searches only a few layers deep at a time, prioritizing the most promising parts of the tree for further exploration.
  • Simulates outcomes rather than exhaustively searching.
  • Adaptable to high branching factors, isolating relevant tree sections.

Implementation of MCTS

  • Relies on a forward model to predict game outcomes based on actions.
  • Uses four key steps (a minimal code sketch follows this list):
    1. Selection: Navigate from the root to a promising node in the tree.
    2. Expansion: Add a new child state to the tree, unless a terminal state has been reached.
    3. Simulation: Perform a random playout from that state until a terminal state is reached.
    4. Backpropagation: Update the values of all nodes on the path back to the root.
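
Putting the four steps together, here is a minimal MCTS sketch in Python. The tiny token-taking game stands in as a hypothetical forward model, and the class names, constants, and structure are illustrative assumptions rather than the implementation discussed in the video.

    import math
    import random

    # Hypothetical forward model: a tiny "take 1-3 tokens, taking the last one wins" game.
    class NimState:
        def __init__(self, tokens=10, player=1):
            self.tokens = tokens
            self.player = player                  # player to move: +1 or -1

        def legal_actions(self):
            return [n for n in (1, 2, 3) if n <= self.tokens]

        def apply(self, action):
            return NimState(self.tokens - action, -self.player)

        def is_terminal(self):
            return self.tokens == 0

        def winner(self):
            return -self.player                   # whoever took the last token

    class Node:
        def __init__(self, state, parent=None, action=None):
            self.state, self.parent, self.action = state, parent, action
            self.children = []
            self.untried = state.legal_actions()  # actions not yet expanded
            self.visits = 0
            self.wins = 0.0                       # wins for the player who moved into this node

    def uct_select(node, c=1.4):
        # Selection policy: pick the child with the highest UCT score.
        return max(node.children,
                   key=lambda ch: ch.wins / ch.visits
                                  + c * math.sqrt(math.log(node.visits) / ch.visits))

    def mcts(root_state, iterations=2000):
        root = Node(root_state)
        for _ in range(iterations):
            node = root
            # 1. Selection: descend through fully expanded nodes using UCT.
            while not node.untried and node.children:
                node = uct_select(node)
            # 2. Expansion: add one new child state, unless this node is terminal.
            if node.untried:
                action = node.untried.pop()
                node.children.append(Node(node.state.apply(action), parent=node, action=action))
                node = node.children[-1]
            # 3. Simulation: random playout until a terminal state is reached.
            state = node.state
            while not state.is_terminal():
                state = state.apply(random.choice(state.legal_actions()))
            winner = state.winner()
            # 4. Backpropagation: update every node on the path back to the root.
            while node is not None:
                node.visits += 1
                if winner == -node.state.player:  # the player who moved into this node won
                    node.wins += 1
                node = node.parent
        return max(root.children, key=lambda ch: ch.visits).action

    print(mcts(NimState(tokens=10)))  # usually prints 2: it leaves the opponent a multiple of 4

The selection step here uses the UCT score described under "Performance of MCTS" below; swapping the toy forward model for a real game's rules is the main change a fuller version would need.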

Performance of MCTS

  • Requires hundreds to thousands of iterations to produce accurate value estimates.
  • Balances exploration of different parts of the tree against exploitation of known good moves, using selection policies such as UCT (Upper Confidence bounds applied to Trees); the formula is given below.
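
The standard exploration/exploitation balance in UCT comes from the UCB1 score: each child i of a node is rated by its observed win rate plus an exploration bonus, and the child with the highest score is chosen during Selection.

    UCT(i) = w_i / n_i  +  c * sqrt( ln(N) / n_i )

Here w_i is the number of wins recorded through child i, n_i is that child's visit count, N is the parent's visit count, and c is a tunable exploration constant (sqrt(2), roughly 1.41, is a common default). The first term favours children that have performed well so far; the second grows for children visited rarely relative to their parent, pulling the search back toward under-explored moves.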

Benefits of MCTS

  • Considered an anytime algorithm: it can return its current best answer whenever it is stopped, however much time or computation it has been given.
  • Notable implementation: AlphaGo by Google DeepMind.

Applications

  • Used in various projects, including:
    • Fable Legends (canceled)
    • Total War franchise
    • Research projects: Ms. Pac-Man, Magic: The Gathering, general video game AI competitions.

Conclusion

  • MCTS is increasingly popular in AI for games and general intelligence.
  • Supporting resources: Patreon contributors help fund video production.