
Designing an Ad Click Aggregator

May 6, 2025

Lecture Notes: Designing an Ad Click Aggregator

Introduction

Designing an ad click aggregator is a common system design interview question.

Understanding the Ad Click Aggregator

So let’s start by looking at the end users: who are they, and how do they use the system? This will sketch out some of the major components. First, we have the users clicking ads: when a user clicks an ad, they should be redirected to the advertiser’s website without issue.

We also have the advertisers: they need to track the performance of their ad campaigns, and this requires the ability to query click metrics over a specific time period with a certain level of detail. This gives rise to the second functional requirement: advertisers can query click metrics over time with granularity down to one minute.

So the fundamental purpose of an ad click aggregator is to collect and aggregate data on ad clicks, and provide this information to advertisers, ideally in real time.

  • Collects and aggregates data on ad clicks
  • Logs user clicks on ads, providing advertisers with metrics

Functional Requirements

[[So what do users expect from the system? Users can be external users, internal users, or even machines. What’s the expectation for this system?

So the first one: the user clicks on an ad, and the expectation is that the user gets directed to the advertiser’s website. You click on an SAP ad, you get directed to SAP’s website.

The second functional requirement, a more interesting one, is that advertisers expect to query for click metrics over time. For my SAP campaign, I want to know how many clicks we had last week, maybe with a granularity of an hour. Then they zoom in on the last day and choose a granularity of one minute. So one minute is the minimum granularity that we are going to support. ]]

  1. User Redirection: Users click an ad and are redirected to the advertiser’s site.
  2. Advertiser Queries: Advertisers can query click metrics over time with granularity down to 1 minute.
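The one-minute granularity from requirement 2 can be sketched as bucketing timestamps into window starts and filtering an in-memory metrics map. The function names and the `(ad_id, bucket) -> count` layout are illustrative assumptions, not a prescribed schema:

```python
from datetime import datetime, timezone

def bucket_start(ts: datetime, granularity_s: int = 60) -> datetime:
    """Round a click timestamp down to the start of its aggregation bucket."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % granularity_s, tz=timezone.utc)

def query_clicks(metrics: dict, ad_id: str, start: datetime, end: datetime) -> dict:
    """Return per-bucket click counts for one ad over [start, end).

    `metrics` is assumed to map (ad_id, bucket_start) -> count.
    """
    return {
        bucket: count
        for (aid, bucket), count in metrics.items()
        if aid == ad_id and start <= bucket < end
    }
```

Coarser granularities (hourly, daily) can then be served by summing these minute buckets, which is why one minute is the minimum we store.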

Out of scope, but interesting:

  • Ad targeting and serving
  • Cross-device tracking
  • Offline channel integration

[[So what qualities does this system need to have in order to provide a good user experience? Users can be external users, internal users, or advertisers. What’s the expectation for this system?]]

Non-functional Requirements

[How do you come up with 10 million ads and 10,000 clicks per second? Look this up.]

  • Scalability: Support for 10 million ads, 10,000 clicks per second
[We want low latency on analytic queries, the queries from advertisers; quantified, less than 1 second of course.]
  • Low Latency: Queries should have <1 second response time
[We don’t want to lose clicks. This data often determines how much advertisers need to pay or how much gets paid out, so we want to make sure it is accurate. Therefore: fault tolerance and data integrity.]
  • Fault Tolerance & Data Integrity: Ensure no loss of click data
[We want to make things as real time as possible; the data should be as up to date as the granularity allows.]
  • Real-time Analytics: Data should be updated quickly (at least within 1-minute granularity)
[Think of scammers: if a user goes click, click, click, click on the same ad, that should be counted as one click. We call this idempotency: any given user’s click should count once.]
  • Idempotency: Repeated clicks on the same ad impression are counted only once

[It is also worth mentioning, alongside the non-functional requirements, that spam detection, demographic profiling, and conversion tracking are not part of the scope here.]

As the next step, the system interface. What does that mean? Clearly outlining what data the system receives and what it outputs.

For example, the input to our system is going to be click data coming in from users, and advertiser queries coming in on the other side from the advertisers. Those are the two inputs.

The outputs, respectively: for users, the redirection; for advertisers, the aggregated click metrics.

Data Flow

[Just a simple linear list of the steps to transform our input into output:

  1. First, click data comes into the system.
  2. Next, the user is redirected.
  3. Next, we validate the click data (the idempotency concern from our non-functional requirements).
  4. Then we log the raw click data.
  5. Next, we aggregate it: data comes in, we validate it, we log it,
  6. and we run an aggregation process over it to put it into a format that is ultimately read-optimized.

This is the high-level flow of how our system is going to work.]

  1. Click data comes into the system
  2. User is redirected
  3. Data is validated, logged, and aggregated
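The three steps above can be sketched as a minimal in-memory pipeline. All names are hypothetical, and the sets and dicts stand in for real stores (a dedup cache, the raw click log, and the read-optimized aggregate store):

```python
from collections import defaultdict

seen_impressions = set()          # stand-in for an idempotency/dedup cache
raw_click_log = []                # stand-in for the write-optimized click store
minute_counts = defaultdict(int)  # stand-in for the read-optimized OLAP store

def handle_click(ad_id: str, impression_id: str, ts_epoch: int) -> bool:
    """Validate, log, and aggregate one click; returns False for duplicates."""
    if impression_id in seen_impressions:          # validate (idempotency)
        return False
    seen_impressions.add(impression_id)
    raw_click_log.append((ad_id, impression_id, ts_epoch))  # log raw click
    minute_counts[(ad_id, ts_epoch - ts_epoch % 60)] += 1   # aggregate per minute
    return True
```

Keeping the raw log separate from the aggregates is deliberate: the aggregates serve fast queries, while the raw log supports recounting and reconciliation later.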

System Design

System Interface and Data Flow

  • Inputs: Click data from users, Advertiser queries
  • Outputs: User redirection to ads, Aggregated click metrics to advertisers

High-level Design

  1. User Click Flow
    • Ad placement service gives ads & metadata (ad ID, redirect URL)
    • On click, ad ID sent to click processor
    • Redirect URL fetched, user redirected to advertiser’s site via 302 redirect
  2. Data Storage and Query
    • Clicks stored in a write-optimized database (e.g., Cassandra)
    • Slow aggregation queries on raw click data
    • Solution: Pre-aggregation using Spark and storing in a read-optimized database (OLAP)
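The click-and-redirect path can be sketched as a handler that returns an HTTP-style status and headers. The ad-to-URL table is a hypothetical stand-in for metadata from the ad placement service:

```python
# Hypothetical redirect table; in practice this comes from the ad placement
# service or a cache keyed by ad ID.
AD_REDIRECTS = {"ad-123": "https://www.example-advertiser.com/landing"}

def process_click(ad_id: str):
    """Look up the redirect URL and return a 302; 404 for unknown ads.

    A real click processor would also enqueue the click event to the
    stream (e.g., Kafka/Kinesis) before responding.
    """
    url = AD_REDIRECTS.get(ad_id)
    if url is None:
        return 404, {}
    return 302, {"Location": url}
```

A 302 keeps the user-facing latency low: the processor only records the event and hands back a `Location` header, leaving the heavy aggregation work to happen asynchronously downstream.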

Deep Dives

Improving Real-time Capabilities

  • Use stream processing (e.g., Kafka/Kinesis) and stream aggregator (e.g., Flink)
  • Set aggregation window to 1 minute, flush intervals for near-real-time data
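A hand-rolled sketch of the 1-minute tumbling window, standing in for what a stream processor like Flink provides out of the box: buffer counts per `(ad, minute)` and flush a bucket once its window has closed. The class and sink are assumptions for illustration:

```python
from collections import defaultdict

class MinuteAggregator:
    """Tumbling-window click counter with explicit flush (Flink stand-in)."""

    def __init__(self, window_s: int = 60):
        self.window_s = window_s
        self.open_buckets = defaultdict(int)  # (ad_id, bucket_start) -> count
        self.flushed = []                     # stand-in for the OLAP sink

    def on_click(self, ad_id: str, ts_epoch: int) -> None:
        bucket = ts_epoch - ts_epoch % self.window_s
        self.open_buckets[(ad_id, bucket)] += 1

    def flush(self, now_epoch: int) -> None:
        """Emit every bucket whose window has fully closed."""
        closed = [k for k in self.open_buckets
                  if k[1] + self.window_s <= now_epoch]
        for key in closed:
            self.flushed.append((key, self.open_buckets.pop(key)))
```

Note the trade-off the flush interval encodes: flushing every minute matches the query granularity, while a real stream processor would also handle late events via watermarks, which this sketch omits.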

Scalability

  • Horizontal scaling of click processor and ad placement services
  • Shard data streams by ad ID to handle peak loads
  • Use of additional sharding techniques for popular ads (hot shard problem)
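One common form of "additional sharding" for hot ads is key salting. The hot-ad set, salt count, and hash choice below are illustrative assumptions; in practice the hot set would come from monitoring:

```python
import hashlib
import random

HOT_ADS = {"ad-superbowl"}  # hypothetical: ads known to spike
NUM_SALTS = 8

def shard_key(ad_id: str) -> str:
    """Append a random salt for hot ads so their clicks spread across
    several partitions instead of overwhelming a single shard."""
    if ad_id in HOT_ADS:
        return f"{ad_id}:{random.randrange(NUM_SALTS)}"
    return ad_id

def shard_for(key: str, num_shards: int = 16) -> int:
    """Stable hash -> shard index (md5 keeps this deterministic across runs)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards
```

The cost of salting is that downstream aggregation must merge the per-salt partial counts for a hot ad back into one total, which the 1-minute aggregation step can do cheaply.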

Fault Tolerance

  • Enable retention policies (e.g., 7 days) on streams
  • Consider checkpointing in stream processing
  • Implement periodic reconciliation to ensure high data integrity
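Periodic reconciliation can be sketched as a batch recount of the raw click log compared against what the streaming pipeline wrote to the OLAP store. The data shapes here are assumptions matching the earlier sketches:

```python
from collections import Counter

def reconcile(raw_log, olap_counts, window_s: int = 60):
    """Return the (ad_id, bucket) keys whose streamed count disagrees
    with a batch recount of the raw log.

    `raw_log` is an iterable of (ad_id, ts_epoch) pairs;
    `olap_counts` maps (ad_id, bucket_start) -> count.
    """
    recount = Counter(
        (ad_id, ts - ts % window_s) for ad_id, ts in raw_log
    )
    keys = set(recount) | set(olap_counts)
    return {k for k in keys if recount.get(k, 0) != olap_counts.get(k, 0)}
```

Running this, say, nightly and overwriting any mismatched buckets with the batch result is what lets the fast streaming path stay approximate without compromising billing-grade integrity.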

Idempotency

  • Use ad impression IDs to prevent duplicate counting
  • Implement signing of impression IDs to prevent tampering
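Signed impression IDs can be sketched with an HMAC: the server attaches a signature when the ad is served, and the click processor verifies it, so clients cannot forge or tamper with impression IDs. The secret and token format are illustrative assumptions:

```python
import hashlib
import hmac

SECRET = b"server-side-secret"  # hypothetical; never shipped to the client

def sign_impression(impression_id: str) -> str:
    """Issue a tamper-evident token: '<impression_id>.<hmac-sha256 hex>'."""
    sig = hmac.new(SECRET, impression_id.encode(), hashlib.sha256).hexdigest()
    return f"{impression_id}.{sig}"

def verify_impression(token: str) -> bool:
    """Check the signature; constant-time compare avoids timing leaks."""
    impression_id, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, impression_id.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

On the click path, a verified impression ID is then checked against a dedup store; a second click with the same ID is dropped, giving the idempotency guarantee from the requirements.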

Conclusion

  • Various levels of depth expected for mid-level, senior, and staff candidates
  • Emphasis on demonstrating breadth and depth of understanding appropriate to candidate’s level
  • Encouragement to practice with mock interviews for preparation

Final Notes

  • Continued encouragement and resources available at hellointerview.com
  • Feedback and community engagement are motivational and appreciated by Evan

The AWS managed API Gateway is included in the design to act as a central entry point and control plane for all incoming requests to the ad click aggregator system. It provides several key benefits:

  • Centralized Access Point: The API Gateway acts as a single point of entry for both user clicks (initiating redirection) and advertiser queries. This simplifies the system architecture and improves manageability.
  • Load Balancing: The API Gateway distributes incoming requests across multiple instances of the Click Processor service and other backend services. This ensures high availability and prevents any single service from being overloaded.
  • Security and Authentication: The API Gateway can implement various security mechanisms, such as authentication and authorization, to control access to the system’s internal components. This protects the backend services from unauthorized access.
  • Request Routing and Transformation: The API Gateway can route requests to the appropriate backend services and can transform the requests before sending them on, ensuring the backend services receive requests in the correct format.
  • Monitoring and Logging: The API Gateway provides built-in monitoring and logging capabilities, providing valuable information for troubleshooting and system optimization.

What would happen if the API Gateway were missing?

Without an API Gateway, several problems would arise:

  • Direct Exposure: The backend services (Click Processor, Ad Placement, Query Service, etc.) would be exposed directly to the internet. Clients would need to know about each individual service, and there would be no central place to route around a failing instance, so an outage in any one service would directly impact its users.
  • Security Risks: The backend services would be vulnerable to direct attacks without the security measures provided by the API Gateway.
  • Load Imbalance: Incoming requests wouldn’t be properly distributed, leading to potential overload and performance bottlenecks in certain services.
  • Difficult Monitoring and Management: Monitoring and managing the system would become significantly more complex without the centralized monitoring and logging provided by the API Gateway.

In short, the API Gateway is crucial for security, scalability, manageability, and overall system robustness. Its absence would lead to a less secure, less scalable, and much more difficult-to-manage system, potentially resulting in performance problems and security vulnerabilities.