🤖

Game Theory, AI, and Kidney Exchange

Nov 26, 2025

Overview

Interview with Tuomas Sandholm on game theory, AI breakthroughs in poker, defense, and kidney exchange, plus views on LLMs, ethics, and entrepreneurship.

Guest Background

  • Finnish-American computer scientist; TKK/Aalto graduate (1991); PhD in USA; professor and entrepreneur.
  • Recognized by Goldman Sachs among top 100 entrepreneurs; winner of Alfred Kordelin Prize.
  • H-index 98: at least 98 papers each cited ≥98 times.

Game Theory: Core Ideas

  • Studies strategic decision-making with multiple players where outcomes depend on others’ strategies.
  • Solution concepts (e.g., Nash equilibrium): no player benefits by unilateral deviation.
  • Perfect information games: all states visible (chess, go). Imperfect information games: hidden information (poker).
  • Sandholm’s vision: solve large, realistic games computationally, not by hand.

Poker AI Breakthroughs

  • Began poker research in 2004; early losses, steady algorithmic improvement.
  • 2015: Claudico lost to pros; key learnings.
  • 2017: Libratus decisively beat four top-10 heads-up No-Limit Texas Hold’em pros.
  • 2019: Pluribus achieved superhuman performance in six-player No-Limit Hold’em.

Why It Mattered

  • Imperfect-information game benchmark solved at superhuman level.
  • Real-world applications (defense, business) are mostly imperfect-information games.

Key Algorithmic Innovations

  • Real-time reasoning for imperfect-information games: compute situational strategy while maintaining global balance.
  • Middlegame reasoning for scalability in multi-player settings.
  • Approximate equilibria computed; full game size is astronomical (~10^161 states for 2-player, 200BB).

Human Strategy Insights from Bots

  • Overbets and small underbets gained prominence.
  • Bot behaviors contradicting poker books:
    • Limping (8–12% in heads-up) used effectively with proper follow-up strategy.
    • Donk betting used in specific, probabilistic contexts.

Large Language Models (LLMs) vs Game-Theoretic AI

  • Poker/strategy AIs optimize directly from rules; no human data training; produce novel strategies.
  • LLMs: token-by-token text prediction; issues include hallucinations, data scarcity, self-training on outputs.
  • Strategic reasoning relies on game trees and information sets; fundamentally different from word sequences.
  • LLMs useful, but less reliable for strategy; risk of degrading users’ own skills with overreliance.

Kidney Exchange: AI Saving Lives

  • Problem: Living donor pairs often incompatible; deceased donor supply insufficient.
  • Solution: National kidney exchange optimizes cycles (rings) and chains.
  • Chains from altruistic donors (no return kidney) enable ongoing matches; first chain run 2006; widely adopted by 2011.
  • Since 2010, Sandholm’s algorithms have run the US national exchange weekly; 80% of centers participate.
  • Result: Over 10,000 life-saving transplants enabled via chains worldwide.

Optimization and Incentives

  • Finding optimal combination of rings and chains is NP-hard; algorithms compute provably optimal matchings.
  • Hospitals often “cheat” by doing local matches first; all centers exploited this when possible.
  • Proposed credit mechanisms to improve incentives (conceptual work).

Strategy Robot (Company)

  • Products: software based on computational game theory for defense applications; bootstrapped and profitable.
  • Use cases across tactical, operational, and strategic levels:
    • Fighter tactics in sensor-driven, imperfect-information environments.
    • Carrier strike group and drone swarm employment.
    • Strategic stability: nuclear deterrence, escalation/de-escalation modeling.
  • Not autonomous physical robots; “robot” is metaphor for strategy generation.
  • Collaboration with US DoD; details limited to public disclosures.

Ethics and Life-or-Death Decisions

  • Kidney exchange: algorithms decide weekly who gets organs; humans set values, algorithms optimize policy.
  • Ends vs means distinction:
    • Ends (value judgments) are human: maximize number of transplants, added life-years, prioritize highly sensitized patients, etc.
    • Means (policy optimization) are computational: map value goals to optimal allocations.
  • Defense work follows existing DoD policies; air-to-air contexts avoid civilian dilemmas.

Combinatorial Auctions and Mechanism Design

  • Founded CombineNet; ran large-scale combinatorial procurement auctions (e.g., P&G trucking).
  • Innovations:
    • Expressive bidding languages and fast optimization for NP-hard allocation problems.
    • Automated mechanism design insights; some optimal designs inherently exponential in size.
  • Practical example: capturing known cost advantages (e.g., Korean aluminum suppliers) via tailored mechanisms to reduce buyer costs.

Views on AI, Society, and Art

  • Future of work: high-skill professions disrupted; specialists remain valuable longer than generalists.
  • Art: AI-human hybrids will produce compelling works; competitions may need human-only categories.
  • Risk: cognitive decline from overuse of generative tools; observed classroom performance gaps between homework (with tools) and exams.

Media, Public Reaction, and Security

  • Poker community backlash due to livelihood threats and bot proliferation on sites (not by Sandholm’s team).
  • US law changes and bot risks reduced online poker profitability.
  • Social media hostility experienced; no threats reported.

Investments and Companies

  • Entrepreneurial track: CombineNet (sold 2010), several smaller firms, Strategy Robot.
  • Investing approach: buy/hold, focus on understood businesses; example: AppLovin high appreciation.
  • Positive on Google for AI talent and valuation (not investment advice).

Key Terms & Definitions

  • H-index: researcher has H papers each cited ≥H times (impact metric).
  • Nash equilibrium: strategy profile where no player benefits by unilateral deviation.
  • Imperfect-information game: players have private information (e.g., poker hands).
  • Real-time reasoning: computing situational strategies while preserving global equilibrium constraints.
  • NP-hard: class of problems that are computationally intractable to solve exactly at scale.
  • Altruistic donor: a donor who gives without a designated recipient, enabling chains.

Structured Highlights

TopicProblemApproachResult/Impact
Poker AI (Heads-up)Imperfect information; massive state spaceReal-time reasoning; approximate equilibriumLibratus beat top pros (2017)
Poker AI (Multi-player)Larger game trees; multiple equilibriaMiddlegame reasoning; scalable algorithmsPluribus superhuman (2019)
Kidney ExchangeIncompatibility; low deceased donor supplyOptimal rings and chains; altruistic chain innovationNational weekly optimization; 10,000+ transplants via chains
Defense TacticsSensor-based, hidden-state combatComputational game theory for tactics/strategyBest-in-class fighter tactics; strategic stability modeling
Combinatorial ProcurementComplex supplier constraintsExpressive bids + NP-hard optimizationGlobal auctions; significant cost reductions
Mechanism DesignOptimal sales design complexityAutomated mechanism designProofs of exponential-size optimal mechanisms

Action Items / Next Steps

  • For policy makers: specify value objectives (e.g., life-years vs count) explicitly; let algorithms optimize.
  • For defense planners: incorporate computational game-theoretic tools into escalation stability analysis.
  • For researchers: expand real-time reasoning methods to broader imperfect-information domains.
  • For educators: design assessments that sustain human skill development alongside AI tool use.