
Lecture on A/B Testing Sample Size Calculation

Jul 17, 2024

Introduction

Challenges in A/B testing: Calculating sample size.
Importance for product data science and interview preparation.

Sample Size Calculation

Definition: Calculating sample size based on statistical model, significance level, statistical power, and minimum detectable effect (MDE).
Common Formula:
- n = [(Z_significance_level + Z_statistical_power)² × Variance] / Delta²
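The formula above can be sketched in Python. This is a minimal sketch, assuming a two-tailed test and the standard normal approximation; the function name `sample_size_per_group` and its parameter names are illustrative and not from the lecture, and `variance` is taken to be the pooled (two-group) variance.

```python
import math
from statistics import NormalDist


def sample_size_per_group(alpha, power, variance, delta):
    """Per-group sample size from the normal-approximation formula.

    Assumes a two-tailed test; `variance` is the pooled (two-group)
    variance and `delta` is the absolute difference to detect.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # critical value for power
    n = (z_alpha + z_power) ** 2 * variance / delta ** 2
    return math.ceil(n)  # round up to a whole number of observations
```

Rounding up is a design choice here: rounding down would leave the experiment slightly underpowered.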

Key Components

Parameters in the Formula

n: Sample size per group (double for total experiment size).
Z critical values: One for the significance level (α) and one for the statistical power (1 - β).
Variance: Must be estimated from data; covered in detail in the Variance section.
Delta: Absolute difference between parameters, not synonymous with lift.

Significance Level (α)

Threshold the p-value must fall below for an effect to be considered statistically significant.
p-value: Probability of observing an effect at least as extreme as the one measured, assuming the null hypothesis is true.
Two-tailed test: α split into α/2 for both sides of the distribution.
Common Z-Scores (two-tailed):
- α = 0.01 -> Z = 2.58
- α = 0.05 -> Z = 1.96
- α = 0.10 -> Z = 1.64
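These critical values can be reproduced from the standard normal distribution. The sketch below assumes a two-tailed test and uses the Python standard library; the helper name `z_for_alpha` is illustrative.

```python
from statistics import NormalDist


def z_for_alpha(alpha):
    """Two-tailed critical value: the (1 - alpha/2) quantile of N(0, 1)."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.01, 0.05, 0.10):
    print(f"alpha = {alpha:.2f} -> Z = {z_for_alpha(alpha):.2f}")
```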

Statistical Power (1 - β)

Definition: Probability of detecting an effect if it exists.
Regions under the curve of the alternative distribution: β is the Type II error rate (missing a real effect) and 1 - β is the statistical power.
Common Z-Scores for Power:
- Power = 0.80 -> Z = 0.84
- Power = 0.90 -> Z = 1.28
- Power = 0.95 -> Z = 1.64
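The power Z-scores come from the same standard normal distribution, but without splitting the tail: the value is the quantile at the power itself. A small sketch, with the illustrative helper name `z_for_power`:

```python
from statistics import NormalDist


def z_for_power(power):
    """Z-score for a target power (1 - beta): the `power` quantile of N(0, 1)."""
    return NormalDist().inv_cdf(power)


for power in (0.80, 0.90, 0.95):
    beta = 1 - power  # Type II error rate implied by the chosen power
    print(f"power = {power:.2f} (beta = {beta:.2f}) -> Z = {z_for_power(power):.2f}")
```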

Delta Calculation

Misconception: Delta ≠ Lift. Delta is the absolute difference.
Real-life example of the calculation:
- Baseline rate: 50%, Treatment rate: 55%, Lift = 10% -> Delta = 0.05.
- Formula for Delta²: (θ2 - θ1)²
In practice, for means: μ2 - μ1 and for proportions: P2 - P1.
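The distinction between lift and delta can be made concrete with two small helpers. This is a sketch with illustrative function names, using the 50% to 55% example above.

```python
def delta_from_lift(baseline, relative_lift):
    """Convert a relative lift into the absolute difference (delta)."""
    return baseline * relative_lift


def lift_from_rates(baseline, treatment):
    """Relative lift implied by a baseline value and a treatment value."""
    return (treatment - baseline) / baseline


baseline, treatment = 0.50, 0.55
delta = treatment - baseline                 # absolute difference, about 0.05
lift = lift_from_rates(baseline, treatment)  # relative change, about 0.10
print(f"delta = {delta:.2f}, lift = {lift:.0%}")
```

Plugging the lift (0.10) into the sample size formula instead of the delta (0.05) would understate the required sample size by a factor of four, since delta is squared.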

Variance

Formula for one sample proportion: p(1-p).
Two-sample case: Pooled variance P1(1-P1) + P2(1-P2).
For the mean scenario: Sample variance s² = Σ_{i=1}^{K} (X_i - X̄)² / (K - 1).
Approximation: Multiply the baseline variance by two to get the pooled variance.
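The variance terms above can be sketched as follows. Function names are illustrative, and the `observations` list is made-up data used only to show the (K - 1) denominator.

```python
from statistics import variance


def proportion_variance(p):
    """Variance of a single proportion: p(1 - p)."""
    return p * (1 - p)


def two_sample_proportion_variance(p1, p2):
    """Pooled variance as used in these notes: sum of both groups' variances."""
    return proportion_variance(p1) + proportion_variance(p2)


def approx_two_sample_variance(baseline_variance):
    """Approximation: assume both groups share the baseline variance."""
    return 2 * baseline_variance


# statistics.variance divides by (K - 1), i.e. it is the sample variance s².
observations = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical metric values
s_squared = variance(observations)

print(two_sample_proportion_variance(0.50, 0.55))
print(approx_two_sample_variance(proportion_variance(0.50)))
print(s_squared)
```

The approximation is useful before the experiment runs, when the treatment group's variance is not yet known.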

Approximation Formula

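Common approximation: n ≈ 16 × Variance / Delta² per group, where Variance is the baseline (single-group) variance.
Assumptions leading to the 16 multiplier:
- Significance level (α) = 0.05 (two-tailed), Power = 0.80.
- Pooled variance approximated as 2 × baseline variance, so 2 × (1.96 + 0.84)² = 15.68 ≈ 16.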
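To see how close 16 is to the exact multiplier, the sketch below computes 2 × (Z_significance + Z_power)² from unrounded Z-scores. It assumes a two-tailed test and that the pooled variance is twice the baseline variance; the function names are illustrative.

```python
from statistics import NormalDist


def exact_multiplier(alpha=0.05, power=0.80):
    """Multiplier on baseline_variance / delta², assuming pooled = 2 × baseline."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_power) ** 2


def rule_of_16(baseline_variance, delta):
    """Quick per-group estimate, valid only for alpha = 0.05 and power = 0.80."""
    return 16 * baseline_variance / delta ** 2


print(f"exact multiplier: {exact_multiplier():.2f}")          # close to, but below, 16
print(f"at 90% power:     {exact_multiplier(power=0.90):.2f}")  # the shortcut no longer applies
```

Because 16 slightly overstates the exact multiplier, the shortcut errs on the side of a marginally larger sample.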

Example Calculation

Question Setup:
- MDE = 10% (relative lift), α = 0.05, Power = 0.80, Baseline mean = 10, Variance = 20.
Steps to Solve:
- Calculate the Z critical values (1.96 and 0.84), Delta (10% × 10 = 1), and pooled variance (2 × 20 = 40).
- Apply the formula to get the sample size per group: (1.96 + 0.84)² × 40 / 1² ≈ 314; total size is twice that.
Result:
- Total sample size required = 628 (Control and Treatment combined).
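The example can be checked step by step in Python. This sketch assumes the MDE is a relative lift on the baseline mean, the test is two-tailed, and the pooled variance is twice the baseline variance; variable names are illustrative.

```python
import math
from statistics import NormalDist

mde = 0.10              # relative lift
alpha = 0.05
power = 0.80
baseline_mean = 10
baseline_variance = 20

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
z_power = NormalDist().inv_cdf(power)          # about 0.84

delta = mde * baseline_mean              # absolute difference: 1.0
pooled_variance = 2 * baseline_variance  # approximation: 40

n_per_group = math.ceil((z_alpha + z_power) ** 2 * pooled_variance / delta ** 2)
total = 2 * n_per_group

print(n_per_group, total)  # 314 628
```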

Conclusion

Comprehensive understanding of sample size calculations in A/B testing.
Additional resources on dataentity.com A/B testing course for more details.
