đź”—

Markov Chains Overview

Jul 28, 2025

Overview

This lecture traces the history and significance of Markov chains, explaining how they revolutionized probability theory and became foundational in fields like nuclear physics, web search, and predictive text.

The Russian Math Feud and the Law of Large Numbers

  • Early 20th-century Russian mathematicians Nekrasov and Markov clashed over the meaning and requirements of the law of large numbers.
  • Nekrasov argued that the law of large numbers requires independent events; since social statistics were observed to converge, he concluded that human decisions must be independent, and hence evidence of free will.
  • Markov disagreed, aiming to show that dependent events could also obey the law of large numbers.

Markov Chains: Concept and Application

  • Markov analyzed letter sequences in a Russian poem to demonstrate dependency between letters.
  • He built a model (now called a Markov chain) to predict the next letter based on the current one using transition probabilities.
  • Markov chains are “memoryless”: only the current state matters for prediction, not the full history.
  • He proved that even dependent events (modeled by Markov chains) can satisfy the law of large numbers.
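A minimal sketch of the kind of analysis Markov performed: classify each letter of a text as a vowel or consonant and estimate the transition probabilities between the two states. The sample sentence and the `transition_probs` helper are illustrative, not Markov's actual data.

```python
from collections import Counter, defaultdict

def transition_probs(text):
    """Estimate first-order transition probabilities between
    vowels ('V') and consonants ('C'), in the spirit of Markov's
    analysis of Pushkin's Eugene Onegin."""
    vowels = set("aeiou")
    # Keep only letters, then map each to its state.
    states = ["V" if ch in vowels else "C" for ch in text.lower() if ch.isalpha()]
    counts = defaultdict(Counter)
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    # Normalize counts into probabilities per current state.
    return {s: {t: n / sum(c.values()) for t, n in c.items()}
            for s, c in counts.items()}

probs = transition_probs("onegin is a classic of russian poetry")
```

Each inner dictionary sums to 1: given the current state, it gives the probability distribution over the next state, which is exactly the memoryless prediction rule described above.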

Markov Chains in Nuclear Physics: The Monte Carlo Method

  • During the Manhattan Project, Ulam and von Neumann used Markov chains to model neutron behavior in nuclear bombs.
  • They simulated chains of neutron events with probabilities for scattering, absorption, or further fission.
  • The Monte Carlo method involves running thousands of such chains to statistically estimate complex outcomes.
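The chain-of-events idea can be sketched as a tiny branching simulation. The probabilities below are made-up placeholders for illustration, not physical cross-section data; the structure (follow each neutron until it scatters, is absorbed, or triggers fission, then average over many chains) is the point.

```python
import random

def neutron_chain(p_fission=0.3, p_absorb=0.3, max_steps=100):
    """Follow one chain of neutron events with illustrative
    probabilities. Returns the number of fissions in the chain."""
    neutrons, fissions = 1, 0
    for _ in range(max_steps):
        if neutrons == 0:
            break
        next_gen = 0
        for _ in range(neutrons):
            r = random.random()
            if r < p_fission:
                fissions += 1
                next_gen += 2      # fission releases ~2 new neutrons
            elif r < p_fission + p_absorb:
                pass               # absorbed: this branch of the chain ends
            else:
                next_gen += 1      # scattered: the neutron continues
        neutrons = next_gen
    return fissions

# Monte Carlo estimate: average the outcome over many independent chains.
random.seed(0)
avg_fissions = sum(neutron_chain() for _ in range(10_000)) / 10_000
```

Running thousands of such chains and averaging, rather than solving the physics analytically, is the essence of the Monte Carlo method.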

Markov Chains in Search Engines: PageRank and Google

  • Google improved web search by modeling webpages and links as a Markov chain (PageRank).
  • The likelihood of visiting a page depends on the link structure and a random “jump” factor to ensure connectivity.
  • PageRank measures page quality by the long-term proportion of time a “random surfer” spends on each page.
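The random-surfer model above can be sketched as a small power iteration. The three-page link graph is a toy example; `damping` is the standard probability of following a link rather than jumping to a random page.

```python
def pagerank(links, damping=0.85, iters=100):
    """Power-iteration PageRank on a dict {page: [linked pages]}.
    A 'random surfer' follows a link with probability `damping`,
    otherwise jumps to a uniformly random page."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:                       # dangling page: spread evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

# Toy web: A links to B and C, B links to C, C links back to A.
ranks = pagerank({"A": ["B", "C"], "B": ["C"], "C": ["A"]})
```

The result is the chain's long-run distribution: page C, which collects links from both A and B, ends up with the largest share of the surfer's time.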

Markov Chains and Predictive Text

  • Claude Shannon extended Markov chains to predict text, using previous letters or words to model language structure.
  • Modern language models build on the same sequential-prediction idea, conditioning on far longer contexts with mechanisms such as attention to improve prediction quality.
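Shannon's word-level experiment can be sketched as a bigram model: record which words follow each word in a training text, then generate by repeatedly sampling a successor of the current word. The training sentence here is an arbitrary illustration.

```python
import random
from collections import defaultdict

def build_model(text):
    """Shannon-style word bigram model: for each word, collect the
    words that follow it in the training text."""
    words = text.split()
    follows = defaultdict(list)
    for cur, nxt in zip(words, words[1:]):
        follows[cur].append(nxt)
    return follows

def generate(model, start, length=8, seed=0):
    """Walk the chain: repeatedly pick a random observed successor."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        options = model.get(out[-1])
        if not options:
            break
        out.append(rng.choice(options))
    return " ".join(out)

model = build_model("the cat sat on the mat and the cat ran")
sample = generate(model, "the")
```

Because duplicates are kept in each successor list, sampling uniformly from the list reproduces the observed transition probabilities, the same trick as weighting by counts.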

Limitations and Power of Markov Chains

  • Markov chains struggle with feedback loops, where the outcome affects future states in complex ways (e.g., climate systems).
  • The memoryless assumption simplifies many systems enough to make practical prediction possible despite their underlying complexity.

Card Shuffling and Markov Chains

  • Shuffling cards can be modeled as a Markov chain, where each deck arrangement is a state.
  • About seven riffle shuffles are enough to randomize a standard 52-card deck, while less effective shuffling methods may require thousands of shuffles.
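One step of this chain can be sketched with the Gilbert–Shannon–Reeds model of a riffle shuffle: cut the deck at a binomially distributed point, then interleave the two packets, dropping cards in proportion to the packets' remaining sizes. This is a standard idealization of a riffle, used here for illustration.

```python
import random

def riffle(deck, rng):
    """One Gilbert-Shannon-Reeds riffle: binomial cut, then
    interleave, dropping from each packet with probability
    proportional to its remaining size."""
    cut = sum(rng.random() < 0.5 for _ in deck)   # binomial cut point
    left, right = deck[:cut], deck[cut:]
    out = []
    while left or right:
        if rng.random() < len(left) / (len(left) + len(right)):
            out.append(left.pop(0))
        else:
            out.append(right.pop(0))
    return out

rng = random.Random(1)
deck = list(range(52))
for _ in range(7):          # ~7 riffles mix a 52-card deck
    deck = riffle(deck, rng)
```

Each arrangement of the deck is one state of the chain, and `riffle` is one application of its transition probabilities; after about seven steps the distribution over arrangements is close to uniform.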

Key Terms & Definitions

  • Law of Large Numbers — The principle that averages of repeated independent trials converge to the expected value as the number of trials increases.
  • Markov Chain — A model where transitions between states depend only on the current state, not on past history.
  • Transition Probability — The likelihood of moving from one state to another in a Markov chain.
  • Monte Carlo Method — A statistical technique using repeated random sampling to estimate complex outcomes.
  • PageRank — An algorithm that ranks webpages using a Markov chain based on link structure and random jumps.
  • Memoryless Property — The feature by which the future state depends only on the present state, not on the sequence of events that preceded it.

Action Items / Next Steps

  • Review how Markov chains differ from independent-event models in probability.
  • Practice calculating transition probabilities in simple Markov chains.
  • Explore exercises or readings on Monte Carlo simulations and Markov chain applications.
  • Investigate further how PageRank works and its impact on web search.