Lecture Notes: Designing a Ranking Model for Instagram Feed
Introduction
- Speaker: Riam, a machine learning engineer with experience at Meta
- Focus on ranking and recommendation systems
Objective
- Design a ranking model for Instagram feed
- Focus on suggested posts from non-connected users
- Goal: Improve user engagement and daily active users (DAU)
Key Considerations
Business Requirements
- Improve engagement metrics like DAU and session time
- Engage users with suggested posts
Metrics
- Measure engagement at an individual level
- Align machine learning objectives with business goals
System Design Overview
Functional Requirements
- Build a model that enhances individual engagement
- Activities: Viewing, liking, commenting on posts
- The ML objective should correlate with the business metrics
Non-functional Requirements
- System should be scalable and available
- Tools needed for debugging, monitoring, and MLOps
- Possible analytics: Popular content creators, geographical distribution, trending posts
Pipeline Design
Phases
- Candidate Generation
- Ranking
- Post-Processing (Fairness, Diversity, Freshness)
Data and Features
Types of Features
- Viewer features: Interaction history, aggregated and delayed features
- Post features: Creator information, engagement history
- Embeddings from video, audio, text
Interaction Data
- Record a binary label (0 or 1) for each user-post interaction
- Use thresholds (for example, a minimum view time) to decide which impressions count as non-interaction events
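The labeling rule above can be sketched in a few lines. This is only an illustration: the `Impression` record, its field names, and the 2-second `MIN_VIEW_SECONDS` threshold are assumptions made for the example, not values given in the lecture.

```python
from dataclasses import dataclass

# Hypothetical threshold: a view shorter than this, with no explicit action,
# is treated as a non-interaction (label 0).
MIN_VIEW_SECONDS = 2.0


@dataclass
class Impression:
    viewer_id: str
    post_id: str
    view_seconds: float
    liked: bool = False
    commented: bool = False


def label_impression(imp: Impression, min_view_seconds: float = MIN_VIEW_SECONDS) -> int:
    """Return 1 for an interaction and 0 for a non-interaction."""
    if imp.liked or imp.commented:
        return 1  # explicit actions always count as positives
    return 1 if imp.view_seconds >= min_view_seconds else 0


impressions = [
    Impression("u1", "p1", view_seconds=0.4),
    Impression("u1", "p2", view_seconds=6.0),
    Impression("u2", "p1", view_seconds=0.9, liked=True),
]
labels = [label_impression(imp) for imp in impressions]
```

Choosing the threshold is a product decision: a lower value produces more positives but noisier labels.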
Model Architecture
Two Approaches
Collaborative Filtering
- Matrix factorization using user-item interaction matrix
- The interaction matrix is sparse; the learned factors approximate the unobserved interactions
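As a rough illustration of matrix factorization, the sketch below fits user and item factors with plain SGD on the observed entries only. The function names, hyperparameters, and the tiny `observed` dataset are all assumptions made for this example; a production system would use a dedicated library and far larger data.

```python
import random


def factorize(interactions, n_users, n_items, dim=4, lr=0.05, reg=0.01, epochs=300, seed=0):
    """Fit user/item factors with SGD on observed (user, item, label) triples."""
    rng = random.Random(seed)
    users = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    items = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, y in interactions:
            pred = sum(a * b for a, b in zip(users[u], items[i]))
            err = y - pred
            for k in range(dim):
                pu, qi = users[u][k], items[i][k]  # keep old values for both updates
                users[u][k] += lr * (err * qi - reg * pu)
                items[i][k] += lr * (err * pu - reg * qi)
    return users, items


def predict(users, items, u, i):
    """Approximate the interaction score as the dot product of the factors."""
    return sum(a * b for a, b in zip(users[u], items[i]))


# Sparse observations: most (user, item) pairs are missing.
observed = [(0, 0, 1.0), (0, 1, 0.0), (1, 0, 1.0), (1, 2, 1.0), (2, 1, 1.0), (2, 2, 0.0)]
users, items = factorize(observed, n_users=3, n_items=3)

# Training error on the observed entries, and a score for an unobserved pair.
mse = sum((y - predict(users, items, u, i)) ** 2 for u, i, y in observed) / len(observed)
unseen_score = predict(users, items, 0, 2)
```

The score for the unobserved pair `(0, 2)` is the model's approximation of an interaction that was never recorded, which is the property the notes refer to.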
Two-Tower Neural Network
- Separate networks for viewer and post features
- Each tower generates an embedding; a sigmoid over their combined score (e.g., a dot product) predicts the engagement probability
- Train using binary cross-entropy loss
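The forward pass and loss of a two-tower model can be sketched without any framework. This is a minimal illustration under several assumptions: each tower has a single ReLU hidden layer, the two embeddings are combined with a dot product, and the layer sizes and feature vectors are made up. A real model would be built and trained in a deep learning framework.

```python
import math
import random


def make_tower(in_dim, hidden_dim, out_dim, rng):
    """Create a tower with one hidden layer; weights are lists of rows."""
    def init(rows, cols):
        return [[rng.gauss(0, 0.3) for _ in range(cols)] for _ in range(rows)]
    return {"w1": init(hidden_dim, in_dim), "w2": init(out_dim, hidden_dim)}


def matvec(w, x):
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def tower_forward(tower, features):
    """Map raw features to an embedding."""
    hidden = [max(0.0, h) for h in matvec(tower["w1"], features)]  # ReLU
    return matvec(tower["w2"], hidden)


def predict_engagement(viewer_tower, post_tower, viewer_features, post_features):
    """Sigmoid of the dot product between viewer and post embeddings."""
    v = tower_forward(viewer_tower, viewer_features)
    p = tower_forward(post_tower, post_features)
    logit = sum(a * b for a, b in zip(v, p))
    return 1.0 / (1.0 + math.exp(-logit))


def binary_cross_entropy(prob, label, eps=1e-12):
    prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


rng = random.Random(0)
viewer_tower = make_tower(in_dim=3, hidden_dim=8, out_dim=4, rng=rng)
post_tower = make_tower(in_dim=5, hidden_dim=8, out_dim=4, rng=rng)

prob = predict_engagement(viewer_tower, post_tower, [0.2, 1.0, 0.0], [1.0, 0.0, 0.5, 0.3, 0.1])
loss = binary_cross_entropy(prob, label=1)
```

Because the towers only meet at the dot product, post embeddings can be precomputed offline and reused for nearest neighbor retrieval at serving time.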
Training and Evaluation
- Balance positive and negative samples for training
- Use metrics like AUC-ROC for model evaluation
- Conduct A/B testing to compare new models to existing ones
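The first two points can be illustrated with a small offline sketch: negative down-sampling to balance the classes, and AUC-ROC computed directly from its pairwise definition. The helper names and the toy labels and scores are assumptions made for the example; in practice a library routine would normally be used for AUC.

```python
import random


def downsample_negatives(examples, seed=0):
    """Keep all positives and an equally sized random subset of negatives.

    Each example is a (features, label) pair.
    """
    rng = random.Random(seed)
    positives = [e for e in examples if e[1] == 1]
    negatives = [e for e in examples if e[1] == 0]
    kept = rng.sample(negatives, min(len(positives), len(negatives)))
    balanced = positives + kept
    rng.shuffle(balanced)
    return balanced


def auc_roc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


examples = [([0.1], 1), ([0.2], 0), ([0.3], 0), ([0.4], 0), ([0.5], 1), ([0.6], 0)]
balanced = downsample_negatives(examples)

auc = auc_roc(labels=[1, 0, 1, 0, 0], scores=[0.9, 0.2, 0.4, 0.5, 0.1])
```

The pairwise AUC above is quadratic in the number of examples, so it is only suitable for small samples. Down-sampling negatives also shifts the predicted probabilities, so they may need recalibration if absolute values matter.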
Serving and Post-Processing
- Generate candidate posts using embeddings and nearest neighbor search
- Rank based on engagement probability
- Post-process using rules for fairness and diversity
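The three serving steps can be sketched end to end as follows. Everything here is illustrative: the brute-force cosine search stands in for an approximate nearest neighbor index, `engagement_prob` is a hypothetical table standing in for the ranking model's output, and the "at most one post per creator" rule is just one possible diversity rule.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def generate_candidates(viewer_emb, post_embs, k):
    """Brute-force nearest neighbors; a real system would use an ANN index."""
    by_similarity = sorted(post_embs, key=lambda pid: cosine(viewer_emb, post_embs[pid]), reverse=True)
    return by_similarity[:k]


def rank(candidates, score_fn):
    """Order candidates by predicted engagement probability, highest first."""
    return sorted(candidates, key=score_fn, reverse=True)


def diversify(ranked_posts, creator_of, max_per_creator):
    """Rule-based post-processing: cap posts per creator, preserving rank order."""
    shown = {}
    result = []
    for pid in ranked_posts:
        creator = creator_of[pid]
        if shown.get(creator, 0) < max_per_creator:
            result.append(pid)
            shown[creator] = shown.get(creator, 0) + 1
    return result


post_embs = {"p1": [1.0, 0.0], "p2": [0.9, 0.1], "p3": [0.0, 1.0], "p4": [0.7, 0.7]}
creator_of = {"p1": "a", "p2": "a", "p3": "b", "p4": "c"}
engagement_prob = {"p1": 0.30, "p2": 0.55, "p3": 0.10, "p4": 0.70}  # hypothetical model scores

candidates = generate_candidates([1.0, 0.2], post_embs, k=3)
ranked = rank(candidates, score_fn=engagement_prob.get)
feed = diversify(ranked, creator_of, max_per_creator=1)
```

Keeping the stages separate lets the cheap retrieval step cut the corpus down before the more expensive ranking model runs.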
Evaluation and Continuous Learning
- Evaluate using safeguard metrics (e.g., report rates)
- Implement mechanisms for online learning to handle non-stationarity
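One simple way to picture online learning is a model that takes one gradient step per fresh example, so its predictions follow a shifting distribution. The sketch below uses a single-feature logistic model as a stand-in for the ranking model; the class name, learning rate, and simulated shift are assumptions made for the example.

```python
import math


class OnlineLogisticModel:
    """Logistic regression updated one example at a time."""

    def __init__(self, n_features, lr=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.lr = lr

    def predict(self, x):
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, label):
        """One SGD step on the log loss for a single fresh example."""
        error = self.predict(x) - label
        self.weights = [w - self.lr * error * xi for w, xi in zip(self.weights, x)]
        self.bias -= self.lr * error


model = OnlineLogisticModel(n_features=1)

# Phase 1: the feature is associated with engagement.
for _ in range(200):
    model.update([1.0], label=1)
before_shift = model.predict([1.0])

# Phase 2: user behavior shifts and the same feature no longer leads to engagement.
for _ in range(300):
    model.update([1.0], label=0)
after_shift = model.predict([1.0])
```

Because each update uses only the latest example, the prediction for the same input moves from high to low as the simulated behavior changes, without retraining from scratch.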
Additional Considerations
- Cold start problem: Use popular posts for new users
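A popularity fallback for new users can be sketched as below. The `min_history` cutoff, the function names, and the toy interaction log are assumptions made for the example.

```python
from collections import Counter


def most_popular(interaction_log, k):
    """Return the k posts with the most recorded interactions."""
    counts = Counter(post_id for _, post_id in interaction_log)
    return [post_id for post_id, _ in counts.most_common(k)]


def recommend(viewer_history, personalized_fn, popular_posts, k, min_history=5):
    """Fall back to popular posts when the viewer has too little history."""
    if len(viewer_history) < min_history:
        return popular_posts[:k]
    return personalized_fn(viewer_history)[:k]


interaction_log = [
    ("u1", "p1"), ("u2", "p1"), ("u3", "p1"),
    ("u1", "p2"), ("u2", "p2"),
    ("u3", "p3"),
]
popular = most_popular(interaction_log, k=2)

# A brand-new viewer has no history, so the popular posts are returned.
new_user_feed = recommend([], personalized_fn=lambda history: ["p9"], popular_posts=popular, k=2)
```

Once a viewer accumulates enough interactions, the same entry point switches to the personalized path.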
Conclusion
- Importance of rapid iteration and comprehensive solution coverage
- Acknowledge non-functional requirements and real-world deployment challenges