Basics of Probability for Machine Learning

Sep 25, 2024

Introduction to Machine Learning Tutorial - Basics of Probability Theory

Instructor Introduction

  • Name: Priyatosh
  • Role: Teaching Assistant for the course

Objectives of the Tutorial

  • High-level overview of probability theory concepts
  • A refresher for those already familiar with the material, not an in-depth treatment
  • Anyone who finds a concept unfamiliar is encouraged to review introductory materials

Key Concepts

Sample Space

  • Definition: The set of all possible outcomes of an experiment, denoted by \( \Omega \).
  • Elementary Outcomes: Individual elements of the sample space, denoted by lower-case \( \omega \).
  • Examples:
    1. Rolling a Die: Sample space = {1, 2, 3, 4, 5, 6} (finite)
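    2. Tossing a Coin Until a Stopping Condition Is Met (e.g., until the first head appears): Sample space = finite sequences of H's and T's, such as {H, TH, TTH, ...} (countably infinite)
    3. Measuring Speed of a Vehicle: Sample space = an interval of real numbers, e.g. \( [0, \infty) \) (uncountable)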

Events

  • Definition: Any collection of possible outcomes (subset of sample space).
  • Importance: Focus is often on events rather than elementary outcomes (e.g., odd/even outcomes of a die roll).
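
The die example can be sketched in a few lines of Python. This is only an illustration: the names `omega`, `odd`, `even`, and `occurs` are chosen here and do not come from the tutorial.

```python
# Sample space for one roll of a six-sided die (finite).
omega = frozenset(range(1, 7))

# Events are subsets of the sample space.
odd = frozenset(w for w in omega if w % 2 == 1)
even = omega - odd


def occurs(event, outcome):
    """An event occurs when the observed elementary outcome lies in it."""
    return outcome in event


print(sorted(odd))       # [1, 3, 5]
print(sorted(even))      # [2, 4, 6]
print(occurs(even, 4))   # True
print(odd <= omega)      # True: every event is a subset of omega
```

Representing events as sets means the set operations in the next section apply to them directly.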

Basic Set Theory Notation

  • Capital letters indicate sets, small letters indicate elements.
  • Subset Relation: A is a subset of B if every element in A is also in B.
  • Union: Set containing the elements that are in A, in B, or in both.
  • Intersection: Set containing only the elements common to A and B.
  • Complement: Set containing all elements of the universal set that are not in A, written A'.

Properties of Set Operations

  • Commutativity, Associativity, Distributivity
  • De Morgan's Laws:
    • \( (A \cup B)' = A' \cap B' \)
    • \( (A \cap B)' = A' \cup B' \)
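
As a quick sanity check, De Morgan's laws can be verified on a small finite example with Python's built-in sets. The universal set and the sets `A` and `B` below are arbitrary choices made for illustration, and `complement` is a helper defined here.

```python
# Universal set: outcomes of a die roll (an arbitrary choice for this sketch).
omega = frozenset(range(1, 7))
A = frozenset({1, 2, 3})
B = frozenset({3, 4})


def complement(s, universe=omega):
    """Elements of the universal set that are not in s."""
    return universe - s


# De Morgan's laws: (A u B)' = A' n B'  and  (A n B)' = A' u B'
print(complement(A | B) == complement(A) & complement(B))  # True
print(complement(A & B) == complement(A) | complement(B))  # True
```

A check on one example is not a proof, but it is a handy way to catch a misremembered identity.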

Disjoint Events

  • Definition: Events A and B are disjoint if \( A \cap B = \emptyset \).
  • Pairwise Disjoint Events: Events \( A_1, A_2, A_3, \ldots \) are pairwise disjoint if \( A_i \cap A_j = \emptyset \) for all \( i \neq j \).
  • Partition of Sample Space: If pairwise disjoint events cover the sample space, i.e. \( \bigcup_i A_i = \Omega \), they form a partition.
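
For a finite sample space, the two conditions for a partition can be checked mechanically. The sketch below assumes events are given as Python sets; `pairwise_disjoint` and `is_partition` are illustrative helper names.

```python
from itertools import combinations


def pairwise_disjoint(events):
    """True if every pair of distinct events has an empty intersection."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))


def is_partition(events, omega):
    """True if the events are pairwise disjoint and their union is omega."""
    covered = set().union(*events)
    return pairwise_disjoint(events) and covered == set(omega)


omega = {1, 2, 3, 4, 5, 6}
print(is_partition([{1, 3, 5}, {2, 4, 6}], omega))  # True
print(is_partition([{1, 2}, {3, 4}], omega))        # False: 5 and 6 not covered
print(is_partition([{1, 2, 3}, {3, 4, 5, 6}], omega))  # False: 3 is shared
```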

Sigma Algebra

  • Definition: A collection F of subsets of the sample space with the following properties:
    1. Null set is in F.
    2. If A is in F, then A' is also in F.
    3. Countable unions of sets in F are also in F.
  • Measurable Sets: Sets in F are called F-measurable.
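  • Importance: The power set is always a sigma algebra, but when the sample space is uncountable (e.g., the real numbers) it is generally not possible to assign probabilities consistently to every subset, so F is taken to be a smaller collection of subsets.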
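
On a finite sample space the three properties can be checked directly. In that case a collection contains only finitely many sets, so closure under countable unions reduces to closure under pairwise unions. The sketch below relies on that simplification; `power_set` and `is_sigma_algebra` are names introduced here for illustration.

```python
from itertools import combinations


def power_set(omega):
    """All subsets of a finite set, as a set of frozensets."""
    items = list(omega)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


def is_sigma_algebra(F, omega):
    """Check the three sigma-algebra properties on a FINITE sample space."""
    omega = frozenset(omega)
    F = {frozenset(s) for s in F}
    if frozenset() not in F:                       # 1. null set is in F
        return False
    if any((omega - A) not in F for A in F):       # 2. closed under complement
        return False
    return all((A | B) in F for A in F for B in F)  # 3. closed under union


omega = {1, 2, 3}
print(is_sigma_algebra(power_set(omega), omega))          # True
print(is_sigma_algebra([set(), omega], omega))            # True (trivial one)
print(is_sigma_algebra([set(), {1}, omega], omega))       # False: {2, 3} missing
```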

Probability Measure and Probability Space

  • Probability Measure P: A function from the sigma algebra F to [0, 1] satisfying:
    1. P(null set) = 0
    2. P(Ω) = 1
    3. For pairwise disjoint sets A1, A2, ... in F, \( P\big( \bigcup_i A_i \big) = \sum_i P(A_i) \) (countable additivity)
  • Probability Space: Triple (Ω, F, P) that provides the framework for probability problems.
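
A concrete probability space can be written down for the die example. The sketch below assumes a fair die, takes F to be the power set, and defines P as the uniform measure; these are illustrative choices, not the only possible ones.

```python
from fractions import Fraction

# Sample space; F is implicitly the power set (every subset is an event).
omega = frozenset(range(1, 7))


def P(event):
    """Uniform probability measure |event| / |omega| (assumes a fair die)."""
    event = frozenset(event)
    if not event <= omega:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(omega))


print(P(set()))        # 0
print(P(omega))        # 1
print(P({1, 3, 5}))    # 1/2

# Additivity on two disjoint events.
A, B = {1, 2}, {5}
print(P(A | B) == P(A) + P(B))  # True
```

Using `Fraction` keeps the probabilities exact, so the equalities above are not affected by floating-point rounding.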

Estimating Probability Values

  • Bonferroni's Inequality: Lower bound on the probability of an intersection:
    \( P(A \cap B) \geq P(A) + P(B) - 1 \)
  • Boole's Inequality: Upper bound on the probability of a union of events:
    \( P(A_1 \cup A_2 \cup \ldots) \leq P(A_1) + P(A_2) + \ldots \)
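
Both bounds can be read off the identity P(A ∪ B) = P(A) + P(B) − P(A ∩ B): dropping the non-negative term P(A ∩ B) gives Boole's bound for two events, and using P(A ∪ B) ≤ 1 gives Bonferroni's. A numerical check on a fair die, with events `A` and `B` chosen arbitrarily for illustration, might look like this:

```python
from fractions import Fraction

omega = frozenset(range(1, 7))


def P(event):
    """Uniform measure on a fair die (an assumption of this sketch)."""
    return Fraction(len(frozenset(event) & omega), len(omega))


A = frozenset({1, 2, 3, 4})
B = frozenset({3, 4, 5, 6})

bonferroni_lower = P(A) + P(B) - 1   # 2/3 + 2/3 - 1 = 1/3
boole_upper = P(A) + P(B)            # 4/3, valid but loose since it exceeds 1

print(P(A & B) >= bonferroni_lower)  # True (here the bound holds with equality)
print(P(A | B) <= boole_upper)       # True
```

The bounds are useful precisely when P(A ∩ B) or P(A ∪ B) cannot be computed directly and only the individual probabilities are known.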

Conditional Probability

  • Definition: \( P(A \mid B) = \frac{P(A \cap B)}{P(B)} \) (if P(B) > 0)
  • Importance: Helps update beliefs or predictions based on observed events.
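
As an illustration of the definition, suppose a fair die is rolled and we learn only that the result is even. The sketch below computes how that changes the probability of a six; the helper `cond` and the chosen events are assumptions made for this example.

```python
from fractions import Fraction

omega = frozenset(range(1, 7))


def P(event):
    """Uniform measure on a fair die (an assumption of this sketch)."""
    return Fraction(len(frozenset(event) & omega), len(omega))


def cond(A, B):
    """P(A | B) = P(A n B) / P(B); undefined when P(B) = 0."""
    A, B = frozenset(A), frozenset(B)
    if P(B) == 0:
        raise ValueError("P(B) must be positive")
    return P(A & B) / P(B)


six = {6}
even = {2, 4, 6}
print(P(six))           # 1/6  (before observing anything)
print(cond(six, even))  # 1/3  (after learning the roll is even)
```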

Bayes' Theorem

  • Formula: \( P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)} \) (if P(B) > 0)
  • Application: Computes \( P(A \mid B) \) from the reverse conditional probability \( P(B \mid A) \) together with P(A) and P(B).
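
A small worked example may help. All numbers below are hypothetical: A stands for "a condition is present", B for "a test is positive". P(B) is not given directly, so the sketch obtains it from the law of total probability over the partition {A, A'}, a standard step that the tutorial does not spell out.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B), defined when P(B) > 0."""
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Hypothetical inputs, chosen only for illustration.
p_a = Fraction(1, 100)             # P(A): prior probability of the condition
p_b_given_a = Fraction(95, 100)    # P(B | A): positive test given condition
p_b_given_not_a = Fraction(5, 100)  # P(B | A'): positive test without it

# Law of total probability: P(B) = P(B|A) P(A) + P(B|A') P(A')
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes(p_b_given_a, p_a, p_b)
print(p_b)        # 59/1000
print(posterior)  # 19/118
```

With these made-up numbers the posterior is roughly 0.16, far below P(B | A) = 0.95, which shows why the two conditional probabilities should not be confused.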

Independence of Events

  • Definition: Events A and B are independent if \( P(A \cap B) = P(A) \times P(B) \).
  • Conditional Independence: Events A and B are conditionally independent given C (with P(C) > 0) if \( P(A \cap B \mid C) = P(A \mid C) \times P(B \mid C) \).
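
Both definitions can be tested by enumeration on a small sample space. The sketch below assumes two fair dice with the uniform measure; the events and helper names are illustrative. It also shows that independence does not imply conditional independence.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered pairs (first die, second die), 36 outcomes.
omega = frozenset(product(range(1, 7), repeat=2))


def P(event):
    """Uniform measure on two fair dice (an assumption of this sketch)."""
    return Fraction(len(event & omega), len(omega))


def independent(A, B):
    return P(A & B) == P(A) * P(B)


def cond_independent(A, B, C):
    """P(A n B | C) == P(A | C) * P(B | C), defined when P(C) > 0."""
    pc = P(C)
    if pc == 0:
        raise ValueError("P(C) must be positive")
    return P(A & B & C) / pc == (P(A & C) / pc) * (P(B & C) / pc)


first_even = frozenset(w for w in omega if w[0] % 2 == 0)
second_even = frozenset(w for w in omega if w[1] % 2 == 0)
sum_is_7 = frozenset(w for w in omega if sum(w) == 7)
sum_is_8 = frozenset(w for w in omega if sum(w) == 8)
sum_even = frozenset(w for w in omega if sum(w) % 2 == 0)

print(independent(first_even, sum_is_7))     # True
print(independent(first_even, sum_is_8))     # False
print(independent(first_even, second_even))  # True
# Given that the sum is even, knowing one die is even fixes the other.
print(cond_independent(first_even, second_even, sum_even))  # False
```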

Conclusion

  • Understanding these basics of probability theory is crucial for the rest of the machine learning course.
  • Review and practice these concepts before moving on.