Lecture on Kaggle Competitions for Click Prediction
Overview
Today's talk covers three main Kaggle competitions focused on click prediction:
- Display Advertising Challenge
- Outbrain Click Prediction
- Mobile Ad Click Prediction
1. Display Advertising Challenge
- Timeframe: roughly 5-7 years ago (Criteo hosted this competition on Kaggle in 2014)
- Objective: Predict if a user will click on a given ad
- Context Data:
  - User details
  - Details of the visited page
  - Click label (1 for click, 0 for no click)
Data Features
- I1 to I13: Integer-valued features, mostly counts
- C1 to C26: Categorical features, anonymized (hashed) to protect user privacy
Dataset Size
- Training data: ~45 million samples
- Test data for prediction: ~6 million samples
- After one-hot encoding the categorical features: roughly 33 million binary features, so the data is extremely sparse (see the sketch below)
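To see why the feature count explodes, here is a minimal sketch of sparse one-hot encoding with scikit-learn; the toy columns stand in for the anonymized C1 to C26 fields and are not the real data.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Toy stand-in for the anonymized categorical fields C1..C26.
df = pd.DataFrame({
    "C1": ["a1", "b7", "a1", "c9"],
    "C2": ["x3", "x3", "y5", "z2"],
})

# Every distinct value becomes its own binary column; OneHotEncoder
# returns a scipy sparse matrix by default, which is what keeps tens of
# millions of columns tractable in memory.
encoder = OneHotEncoder()
X = encoder.fit_transform(df)
print(X.shape)  # (4, 6): one column per distinct value per field
```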
Evaluation Metric
- Log loss: Penalizes predicted click probabilities according to how far they diverge from the true binary labels (formula below)
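For reference, the standard log loss over N samples with labels y_i and predicted click probabilities p_i is:

```latex
\text{LogLoss} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log p_i + (1 - y_i) \log(1 - p_i) \right]
```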
Winning Strategy: Team '3 Idiots'
- Workflow:
  - Pre-process the data into 39 base features (13 numerical + 26 categorical)
  - Train Gradient Boosted Decision Trees (GBDT) to generate additional features
  - Transform each sample using the GBDT leaf indices it falls into
  - Train a Field-aware Factorization Machine (FFM) on the combined features
  - Calibrate the predicted probabilities for the final submission
Detailed Steps
- GBDT: Trains trees iteratively, each new tree fitting the residual errors of the ensemble so far (sketch below)
- GBDT Leaf Encoding: N trees of depth D give up to 2^D leaves per tree, so each sample becomes N categorical features, one leaf index per tree
- Log Transformation & Grouping: Log-transform heavy-tailed numerical features; group rare categorical values into a shared bucket
- Hashing: Converts categorical/text features into bounded integer indices via a hash function, the "hashing trick" (sketch below)
- FFM: Decomposes each pairwise interaction into field-aware latent vectors, enriching the feature representation (formula below)
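The GBDT feature-generation step can be sketched with scikit-learn as below; the data, model sizes, and variable names are illustrative assumptions, not the team's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
X = rng.random((1000, 13))           # stand-in for the 13 numeric features
y = rng.integers(0, 2, 1000)         # stand-in click labels

# N trees of depth D -> up to 2^D leaves per tree.
gbdt = GradientBoostingClassifier(n_estimators=30, max_depth=7)
gbdt.fit(X, y)

# apply() returns the leaf index each sample lands in for every tree:
# shape (n_samples, n_trees, 1) for binary classification. Each column
# is then a new categorical feature describing the sample.
leaf_indices = gbdt.apply(X)[:, :, 0]

# One-hot encode the leaf indices so a downstream FFM can consume them.
leaf_features = OneHotEncoder().fit_transform(leaf_indices)
print(leaf_features.shape)
```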
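The numerical transformation, rare-value grouping, and hashing steps might look like the following; the thresholds and the exact log-based discretization are assumptions in the spirit of the winning write-up, not the team's code.

```python
import hashlib
import math

def transform_numeric(v):
    # Compress heavy-tailed count features into a small number of buckets
    # via a log-based discretization (assumed form, for illustration).
    return int(math.log(v) ** 2) if v > 2 else v

def group_rare(value, counts, min_count=10):
    # Replace infrequent categorical values with a shared "rare" token so
    # the model does not waste parameters on near-unique values.
    return value if counts.get(value, 0) >= min_count else "rare"

def hash_feature(field, value, n_bins=1_000_000):
    # The hashing trick: map a "field=value" string to a bounded integer
    # index. A stable hash (here MD5) keeps indices consistent across runs.
    digest = hashlib.md5(f"{field}={value}".encode()).hexdigest()
    return int(digest, 16) % n_bins

print(transform_numeric(1000))   # 47: log(1000)^2, floored
print(hash_feature("C1", "a1"))
```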
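FFM's interaction term, as defined in the FFM literature: each feature j has one latent vector per field, and the interaction between features j1 and j2 uses the vectors indexed by the partner feature's field:

```latex
\phi_{\text{FFM}}(\mathbf{w}, \mathbf{x}) =
  \sum_{j_1 = 1}^{n} \sum_{j_2 = j_1 + 1}^{n}
  \langle \mathbf{w}_{j_1, f_{j_2}}, \mathbf{w}_{j_2, f_{j_1}} \rangle \, x_{j_1} x_{j_2}
```

where f_j denotes the field of feature j.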
2. Outbrain Click Prediction
- Objective: Recommend content on news channels based on user behavior
- Platform: Outbrain is a content-discovery platform whose recommendation widgets appear embedded in news articles
Data Features
- User's Page Views and Clicks: Tracks historical views, documents, platform, location, traffic source
- Click Data: Display IDs, ad IDs, click status, metadata
- Document Meta-data: Publisher channel, publish time, topics, entities, categories
Evaluation Metric
- Mean Average Precision at 12 (MAP@12): Averages the precision of the predicted ranking over the top 12 positions for each display (formula below)
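For a set of displays D, with P_d(k) the precision at cutoff k and rel_d(k) indicating whether the ad at rank k was clicked:

```latex
\text{MAP@12} = \frac{1}{|D|} \sum_{d \in D} \sum_{k=1}^{12} P_d(k)\, \mathrm{rel}_d(k)
```

Since each display in this competition has exactly one clicked ad, the metric effectively reduces to the mean reciprocal rank of the clicked ad, truncated at position 12.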
Third Place Solution
- Feature Extraction & Model Stacking:
  - FFM trained with softmax click probabilities and a pairwise rank loss (a common formulation is given below)
  - XGBoost with a pairwise rank loss
- Additional Features: Page view counts, ad landing page views, impressions per ad, document vectors
- Encoding Strategy: Aggregates a user's historical document vectors into a profile and compares it to candidate documents via inner products (sketch below)
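The notes do not give the exact loss; a common pairwise rank loss (RankNet-style, similar in spirit to XGBoost's rank:pairwise objective) pushes the clicked ad's score above the scores of non-clicked ads from the same display:

```latex
\mathcal{L} = \sum_{(i^{+},\, i^{-})} \log\!\left(1 + e^{-\left(s_{i^{+}} - s_{i^{-}}\right)}\right)
```

where s is the model's score, i+ ranges over clicked ads, and i- over non-clicked ads shown in the same display.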
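A minimal sketch of the user-encoding idea: average the document vectors a user has viewed into a profile, then score candidate documents by inner product. The vector dimensions and names are illustrative assumptions.

```python
import numpy as np

def user_profile(doc_vectors):
    """Aggregate a user's historical document vectors into one profile."""
    return np.mean(doc_vectors, axis=0)

rng = np.random.default_rng(0)
history = rng.random((20, 64))       # 20 previously viewed documents
candidates = rng.random((5, 64))     # 5 candidate ads' landing documents

profile = user_profile(history)
scores = candidates @ profile        # inner-product similarity per candidate
ranking = np.argsort(-scores)        # rank candidates, most similar first
print(ranking)
```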
3. Mobile Ad Click Prediction
- Objective: Predict if a mobile ad will be clicked
Data Features
- Ad Identifiers and Click Status
- Time Details: Year, month, day, hour
- Categorical Variables: Anonymized, includes site, app, device details
Evaluation Metric
- Log loss: Used again, as in the Display Advertising Challenge
Winning Strategy: Expanded '3 Idiots' Team
- Feature Engineering & Model Ensembling:
  - Generated features: count features, bag features, click history
  - Hashing trick: again used to transform categorical/text features into integer indices
  - Model averaging: combines model outputs by geometric averaging in log-odds space via the logistic function (sketch below)
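A sketch of geometric averaging through the logistic function: average each model's predictions in log-odds (logit) space and map the result back with the sigmoid, which amounts to a geometric-style mean of the odds rather than an arithmetic mean of probabilities. The model outputs here are placeholders.

```python
import numpy as np

def logit(p):
    # Map probabilities to log-odds space.
    return np.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ensemble(prob_list):
    """Average predictions in log-odds space, then map back to [0, 1]."""
    return sigmoid(np.mean([logit(p) for p in prob_list], axis=0))

preds_a = np.array([0.10, 0.80, 0.45])   # placeholder model outputs
preds_b = np.array([0.20, 0.70, 0.50])
print(ensemble([preds_a, preds_b]))
```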
Conclusion
- Importance of FFM: Found highly effective across multiple competitions
- Feature Engineering Critical: For model performance
- GBDT and Hashing: Useful for feature transformation
- Current Trends: Likely shifted towards deep learning, can explore recent advancements via AdKDD workshop
Questions & Wrap-Up
- Open floor for questions and comments on the discussed topics.