Market Basket Analysis and Apriori Algorithm
Introduction
- Context: Examples of purchasing behavior in retail stores (e.g., buying iPhones at Target).
- Goal: Increase revenue by understanding and leveraging customer purchasing behaviors.
- Technique: Market Basket Analysis - finding associations between items.
Key Concepts
Market Basket Analysis
- Analyzes the purchasing patterns of customers.
- Uncovers associations between items frequently bought together.
- Examples: customers buying bread and jam, or laptops and laptop bags.
- Helps in product placement and targeted marketing.
- Encourages additional spending through combined offers on related items (e.g., bread, butter, and eggs).
Association Rule Mining
- Rule Structure: If (antecedent) -> Then (consequent).
- Example: If a customer buys A, they are likely to buy B.
- Antecedent: The item or group of items on the "if" side of the rule, as found in the transaction data.
- Consequent: The item or group of items on the "then" side, which tends to be bought along with the antecedent.
- Goal: Identify co-occurrence patterns within transactions.
Algorithm and Metrics
Single and Multiple Cardinality
- The antecedent of a rule can contain a single item or several items (single or multiple cardinality).
- Examples of rules: A -> B; A, B -> C; A, B, C -> D.
Measures: Support, Confidence, and Lift
Support
- Fraction of all transactions that contain a given item or combination of items.
- A minimum support threshold filters out items and combinations that are bought too rarely.
Confidence
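- Likelihood that the consequent is bought given that the antecedent is bought: the share of transactions containing the antecedent that also contain the consequent.
- A minimum confidence threshold keeps the analysis focused on rules that hold often enough to be useful.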
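Lift
- Strength of a rule compared with what would be expected if the antecedent and consequent were bought independently of each other.
- A value above 1 indicates a positive association; a value close to 1 indicates no association, and a value below 1 means the items are bought together less often than expected.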
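Under the usual definitions, the three measures can be written as below. The notation is an assumption made here for illustration: T is the set of transactions, X and Y are disjoint itemsets, and supp, conf, and lift are shorthand names not used in the notes themselves.

```latex
\begin{align*}
\operatorname{supp}(X) &= \frac{|\{\, t \in T : X \subseteq t \,\}|}{|T|} \\
\operatorname{conf}(X \rightarrow Y) &= \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)} \\
\operatorname{lift}(X \rightarrow Y) &= \frac{\operatorname{conf}(X \rightarrow Y)}{\operatorname{supp}(Y)}
  = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)\,\operatorname{supp}(Y)}
\end{align*}
```

A lift of exactly 1 corresponds to X and Y occurring independently of each other.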
Example of Association Rule Mining
- Dataset: Items A, B, C, D, E; transactions T1 to T5.
- Sample rules: A -> D; C -> A; B, C -> A.
- Calculation: Support, confidence, and lift are computed for each rule.
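The contents of T1 to T5 are not listed in these notes, so the following is only a minimal sketch using hypothetical baskets over items A to E. The function names (support, confidence, lift) and the basket contents are assumptions made for illustration.

```python
# Hypothetical baskets: the notes do not list what T1 to T5 actually contain.
transactions = {
    "T1": {"A", "B", "C"},
    "T2": {"A", "C", "D"},
    "T3": {"B", "C", "D"},
    "T4": {"A", "D", "E"},
    "T5": {"B", "C", "E"},
}


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Share of transactions with the antecedent that also hold the consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to the baseline frequency of the consequent."""
    return confidence(antecedent, consequent) / support(consequent)


# The three sample rules: A -> D; C -> A; B, C -> A.
sample_rules = [({"A"}, {"D"}), ({"C"}, {"A"}), ({"B", "C"}, {"A"})]
for antecedent, consequent in sample_rules:
    both = antecedent | consequent
    print(
        sorted(antecedent), "->", sorted(consequent),
        "support:", round(support(both), 3),
        "confidence:", round(confidence(antecedent, consequent), 3),
        "lift:", round(lift(antecedent, consequent), 3),
    )
```

With these particular baskets, A -> D has support 0.4, confidence of about 0.667, and lift of about 1.111, while the other two rules come out with lift below 1.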
Apriori Algorithm
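- Concept: Finds the frequent itemsets first, then generates association rules from them.
- Frequent Itemset: Itemset whose support is at or above a chosen minimum threshold.
- Apriori property: Every subset of a frequent itemset is also frequent. Example: If {A, B} is frequent, then {A} and {B} are also frequent.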
Manual Calculation Example
- Steps: Transaction data -> Itemsets of size 1 -> Calculate support -> Eliminate low-support items -> Repeat for larger itemsets until no frequent itemsets remain.
Pruning
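- Discarding a candidate itemset, before its support is counted, when any of its subsets falls below the support threshold; by the Apriori property such a candidate cannot be frequent.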
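The level-by-level search and the pruning step can be sketched as follows. This is an illustrative, unoptimized version: the baskets, the 0.4 threshold, and the helper name find_frequent_itemsets are all assumptions, not values from the notes.

```python
from itertools import combinations

# Hypothetical baskets (not taken from the notes).
transactions = [
    {"A", "B", "C"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "D", "E"},
    {"B", "C", "E"},
]
MIN_SUPPORT = 0.4  # assumed threshold: at least 2 of the 5 baskets


def support(itemset):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def find_frequent_itemsets(min_support):
    """Return {itemset: support} for every frequent itemset."""
    items = sorted({item for t in transactions for item in t})
    # Level 1: single items that meet the threshold.
    level = {}
    for item in items:
        single = frozenset([item])
        s = support(single)
        if s >= min_support:
            level[single] = s

    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune: drop candidates that have any infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        # Count support only for the candidates that survived pruning.
        counted = {c: support(c) for c in candidates}
        level = {c: s for c, s in counted.items() if s >= min_support}
    return frequent


frequent_itemsets = find_frequent_itemsets(MIN_SUPPORT)
for itemset, s in sorted(frequent_itemsets.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

For these baskets, the candidates {A, B, C} and {B, C, D} are pruned without being counted because {A, B} and {B, D} are not frequent; only {A, C, D} is counted at size 3, and it falls below the threshold.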
Example Calculation
- Given transaction data, find frequent itemsets through several iterations (1-item, 2-item, etc.).
- Generate rules based on support, confidence, and lift.
- Example rule generation with the resulting support and confidence values.
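Rule generation from frequent itemsets might look like the sketch below: each frequent itemset is split into every possible antecedent and consequent, and a rule is kept if its confidence meets a threshold. The baskets, the 0.6 confidence threshold, the hard-coded list of frequent itemsets, and the helper name generate_rules are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical baskets (not taken from the notes).
transactions = [
    {"A", "B", "C"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "D", "E"},
    {"B", "C", "E"},
]
MIN_CONFIDENCE = 0.6  # assumed threshold

# Frequent itemsets of size 2 or more for these baskets,
# assuming a minimum support of 0.4.
frequent_itemsets = [
    frozenset({"A", "C"}),
    frozenset({"A", "D"}),
    frozenset({"B", "C"}),
    frozenset({"C", "D"}),
]


def support(itemset):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def generate_rules(itemsets, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    rules = []
    for itemset in itemsets:
        # Try every non-empty proper subset as the antecedent.
        for size in range(1, len(itemset)):
            for subset in combinations(sorted(itemset), size):
                antecedent = frozenset(subset)
                consequent = itemset - antecedent
                conf = support(itemset) / support(antecedent)
                if conf >= min_confidence:
                    rule_lift = conf / support(consequent)
                    rules.append(
                        (antecedent, consequent, support(itemset), conf, rule_lift)
                    )
    return rules


rules = generate_rules(frequent_itemsets, MIN_CONFIDENCE)
for antecedent, consequent, s, conf, rule_lift in rules:
    print(sorted(antecedent), "->", sorted(consequent),
          round(s, 3), round(conf, 3), round(rule_lift, 3))
```

Note that a rule and its reverse are evaluated separately: here A -> C passes the confidence threshold while C -> A does not, even though both come from the same itemset.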
Implementation in Python (Jupyter Notebook)
- Libraries: pandas, and mlxtend for its apriori and association_rules functions.
- Data: Online retail data; cleaned, transformed, and analyzed.
- Thresholds: Minimum support, confidence, and lift for rules.
- Output: Frequent itemsets and their corresponding metrics.
Conclusion
- Market basket analysis and the Apriori algorithm are powerful tools for retail analytics.
- They help with product placement, promotions, and sales strategies.
- Real-world applications in stores like Walmart, Target, IKEA.
- Encouragement to engage with the material and further explore practical applications.
Feedback: Open to questions and further engagement.