Market Basket Analysis and Apriori Algorithm

Jun 30, 2024

Introduction

  • Context: Examples of purchasing behaviors in retail stores (e.g., buying iPhones at Target).
  • Goal: Increase revenue by understanding and leveraging customer purchasing behaviors.
  • Technique: Market Basket Analysis - finding associations between items.

Key Concepts

Market Basket Analysis

  • Analyzes the purchasing patterns of customers.
  • Uncovers associations between items frequently bought together.
  • Examples:
    • Customers buying bread and jam.
    • Customers buying laptops and laptop bags.
  • Helps in product placement and targeted marketing.
  • Encourages additional spending through offers on items that are bought together (e.g., bread, butter, and eggs).

Association Rule Mining

  • Rule Structure: If (antecedent) -> Then (consequent).
  • Example: If a customer buys A, they are likely to buy B.
  • Antecedent: The item or group of items on the "if" side of the rule, found in the transaction.
  • Consequent: The item or group of items on the "then" side, expected to appear in the same transaction as the antecedent.
  • Goal: Identify co-occurrence patterns within transactions.

Algorithm and Metrics

Single and Multiple Cardinality

  • Rules can relate one item to another (single cardinality) or a group of items to another item (multiple cardinality).
  • Example rules of increasing cardinality:
    • A -> B
    • A, B -> C
    • A, B, C -> D

Measures: Support, Confidence, and Lift

Support

  • Fraction of all transactions that contain an item or combination of items.
  • A minimum-support threshold filters out items and combinations that are bought too rarely.

Confidence

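  • Likelihood that the consequent is bought given that the antecedent is bought: support of both together divided by support of the antecedent.
  • A minimum-confidence threshold keeps only rules whose consequent reliably accompanies the antecedent.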

Lift

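  • Strength of a rule compared to chance: the rule's confidence divided by the support of the consequent.
  • Lift above 1 indicates a positive association; lift near 1 suggests the items are independent; lift below 1 suggests they are bought together less often than chance would predict.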
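
The three measures can be written out as formulas. This is a sketch in standard notation; the symbols are assumptions, not taken from the original notes: T is the set of transactions, and X and Y are the antecedent and consequent itemsets.

```latex
\begin{align*}
\operatorname{support}(X) &= \frac{|\{\, t \in T : X \subseteq t \,\}|}{|T|} \\
\operatorname{confidence}(X \to Y) &= \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)} \\
\operatorname{lift}(X \to Y) &= \frac{\operatorname{confidence}(X \to Y)}{\operatorname{support}(Y)}
  = \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)\,\operatorname{support}(Y)}
\end{align*}
```

The last form shows that lift is symmetric: X -> Y and Y -> X have the same lift, even though their confidences can differ.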

Example of Association Rule Mining

  • Dataset: Items A, B, C, D, E; Transactions T1 to T5.
  • Sample Rules: A -> D; C -> A; B, C -> A.
  • Calculation: Compute support, confidence, and lift for each rule.
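
The calculation can be sketched in Python. The notes do not list the contents of T1 to T5, so the transactions below are hypothetical, made up purely for illustration; the helper names `support` and `rule_metrics` are likewise assumptions.

```python
from fractions import Fraction

# Hypothetical transactions (NOT the data from the original example).
transactions = {
    "T1": {"A", "B", "C"},
    "T2": {"A", "C", "D"},
    "T3": {"B", "C", "D"},
    "T4": {"A", "D", "E"},
    "T5": {"B", "C", "E"},
}


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(1 for items in transactions.values() if set(itemset) <= items)
    return Fraction(hits, len(transactions))


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent))
    confidence = both / support(antecedent)
    lift = confidence / support(consequent)
    return both, confidence, lift


# The three sample rules: A -> D; C -> A; B, C -> A.
sample_rules = [({"A"}, {"D"}), ({"C"}, {"A"}), ({"B", "C"}, {"A"})]
for antecedent, consequent in sample_rules:
    s, c, l = rule_metrics(antecedent, consequent)
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={float(s):.2f} confidence={float(c):.2f} lift={float(l):.2f}")
```

With this made-up data, A -> D has lift 10/9 (slightly above 1), while C -> A (5/6) and B, C -> A (5/9) fall below 1, so only the first rule suggests a positive association.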

Apriori Algorithm

  • Concept: Uses frequent itemsets to generate association rules.
  • Frequent Itemset: Itemset whose support meets a minimum threshold.
  • Key property: Every subset of a frequent itemset is also frequent. Example: If {A, B} is frequent, then {A} and {B} are also frequent.

Manual Calculation Example

  • Steps: Transaction data -> Itemsets of size 1 -> Calculate support -> Eliminate low support items -> Combine the survivors into larger itemsets and repeat until none remain.
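
The steps above can be sketched as a level-wise loop. This is a minimal illustration, not an optimized implementation: the function name `frequent_itemsets`, the baskets, and the 0.4 threshold are all assumptions, and for brevity it counts every joined candidate instead of first discarding candidates by their subsets.

```python
def frequent_itemsets(transactions, min_support):
    """Level-wise search: return {itemset: support} for every frequent itemset."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in `itemset`.
        return sum(1 for t in transactions if itemset <= t) / n

    # Itemsets of size 1: one candidate per distinct item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Calculate support, then eliminate low-support candidates.
        level = {c: support(c) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Continue with larger itemsets: join survivors whose union has k items.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
    return frequent


# Hypothetical baskets, made up for illustration.
baskets = [
    {"bread", "butter", "eggs"},
    {"bread", "butter", "jam"},
    {"bread", "butter", "eggs"},
    {"bread", "jam"},
    {"eggs", "jam"},
]

result = frequent_itemsets(baskets, min_support=0.4)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

On these baskets the loop runs three passes: four frequent single items, four frequent pairs, and one frequent triple ({bread, butter, eggs}).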

Pruning

  • Discarding any candidate itemset that has a subset below the support threshold, without counting it: such a candidate cannot be frequent.
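
A small sketch of the pruning check, assuming the frequent 2-itemsets are already known. The item names and the helper `has_infrequent_subset` are hypothetical, chosen for illustration.

```python
from itertools import combinations


def has_infrequent_subset(candidate, frequent_smaller):
    """True if any subset one item smaller than `candidate` is not frequent."""
    k = len(candidate)
    return any(
        frozenset(subset) not in frequent_smaller
        for subset in combinations(candidate, k - 1)
    )


# Hypothetical frequent 2-itemsets from an earlier pass.
frequent_pairs = {
    frozenset({"bread", "butter"}),
    frozenset({"bread", "eggs"}),
    frozenset({"bread", "jam"}),
    frozenset({"butter", "eggs"}),
}

# Candidate 3-itemsets formed by joining the frequent pairs.
candidates = [
    frozenset({"bread", "butter", "eggs"}),
    frozenset({"bread", "butter", "jam"}),   # contains {butter, jam}: not frequent
    frozenset({"bread", "eggs", "jam"}),     # contains {eggs, jam}: not frequent
]

survivors = [c for c in candidates if not has_infrequent_subset(c, frequent_pairs)]
print([sorted(c) for c in survivors])
```

Only {bread, butter, eggs} survives, so only that candidate needs its support counted against the transaction data.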

Example Calculation

  • Given transaction data, find frequent itemsets through several iterations (1-item, 2-item, etc.).
  • Generate rules based on support, confidence, and lift.
  • Example: generate rules from the frequent itemsets and report the support and confidence of each.
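
Rule generation from frequent itemsets can be sketched as follows. The support values and the 0.8 confidence threshold are hypothetical, and `generate_rules` is an illustrative helper rather than a library function. It relies on every subset of a frequent itemset also being present in the `supports` table.

```python
from itertools import combinations

# Hypothetical supports of frequent itemsets (made up for illustration).
supports = {
    frozenset({"bread"}): 0.8,
    frozenset({"butter"}): 0.6,
    frozenset({"bread", "butter"}): 0.6,
}


def generate_rules(supports, min_confidence):
    """Split each frequent itemset into antecedent -> consequent and score it."""
    rules = []
    for itemset, itemset_support in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for size in range(1, len(itemset)):
            for left in combinations(sorted(itemset), size):
                antecedent = frozenset(left)
                consequent = itemset - antecedent
                confidence = itemset_support / supports[antecedent]
                lift = confidence / supports[consequent]
                if confidence >= min_confidence:
                    rules.append((antecedent, consequent, itemset_support, confidence, lift))
    return rules


rules = generate_rules(supports, min_confidence=0.8)
for antecedent, consequent, s, c, l in rules:
    print(sorted(antecedent), "->", sorted(consequent),
          f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
```

Here butter -> bread passes (every butter basket also contains bread), while bread -> butter is dropped because its confidence of about 0.75 is below the threshold.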

Implementation in Python (Jupyter Notebook)

  • Libraries: pandas, plus mlxtend for its apriori and association_rules functions.
  • Data: Online retail data; cleaned, transformed, and analyzed.
  • Thresholds: Minimum support, confidence, and lift for rules.
  • Output: Frequent itemsets and their corresponding metrics.

Conclusion

  • Market basket analysis and the Apriori algorithm are powerful tools for retail analytics.
  • They inform product placement, promotions, and sales strategies.
  • Real-world applications include stores like Walmart, Target, and IKEA.
  • Encouragement to engage with the material and further explore practical applications.

Feedback: Open to questions and further engagement.