Topic Hub

Experimentation & Statistics

Run experiments you can trust. Definitions and calculators for the statistics that decide whether a result is real — statistical significance, sample size, confidence intervals, control groups, and false positives — plus incrementality and A/B-test methodology for marketing teams that refuse to ship on noise.

15 curated resources across metrics, tools, templates, guides, and articles

Metrics & Definitions

Core glossary terms, formulas, and benchmark context for this topic.

Creative

A/B Testing

A controlled experiment comparing two ad variants that differ by exactly one element, so any change in performance can be attributed to that element.

Metrics

Statistical Significance

Measure of whether an observed difference is large enough, relative to random variation, to be unlikely if there were no real effect.

p-value < Significance Level

Metrics

Sample Size

The number of observations in a sample, critical for the accuracy and reliability of statistical estimates.

n = (Z² × σ²) / E²
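
As a worked illustration of the formula above, here is a minimal sketch for sizing a sample that estimates a single conversion rate. It assumes the variance of a yes/no outcome, σ² = p(1 − p), and a two-sided critical value; the function name and the example inputs are hypothetical, not taken from any calculator on this page.

```python
import math
from statistics import NormalDist


def sample_size_for_margin(p, margin, confidence=0.95):
    """n = Z^2 * sigma^2 / E^2, with sigma^2 = p * (1 - p) for a proportion."""
    # Two-sided critical value, e.g. about 1.96 at 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    variance = p * (1 - p)
    # Round up: a fractional observation cannot be collected.
    return math.ceil(z ** 2 * variance / margin ** 2)


# Hypothetical inputs: an expected 5% conversion rate, estimated to within
# plus or minus 1 percentage point.
n = sample_size_for_margin(p=0.05, margin=0.01)
print(n)
```

Note that this sizes a single estimate. Comparing two variants in an A/B test needs a different calculation that also accounts for statistical power.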

Metrics

Confidence Interval

Range of values that, at a stated confidence level, is expected to contain the true population parameter.

CI = Point Estimate ± (Critical Value × Standard Error)

Metrics

Margin of Error

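Maximum expected difference between a sample estimate and the true value at a given confidence level. It is the half-width of the confidence interval and a measure of the precision of a sample estimate.

Margin of Error = Critical Value × Standard Error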
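
The confidence interval and margin of error formulas above can be sketched together for a conversion rate. This is a minimal example that assumes the normal-approximation (Wald) standard error, which is only one of several interval methods and behaves poorly with small samples or rates near 0 or 1. The function name and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(conversions, visitors, confidence=0.95):
    """Normal-approximation interval for a conversion rate."""
    rate = conversions / visitors                          # point estimate
    std_error = math.sqrt(rate * (1 - rate) / visitors)    # standard error
    critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = critical * std_error                          # margin of error
    return rate - margin, rate + margin, margin


# Hypothetical data: 500 conversions from 10,000 visitors.
low, high, margin = proportion_ci(conversions=500, visitors=10_000)
print(f"{low:.4f} to {high:.4f} (margin of error {margin:.4f})")
```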

Metrics

Control Group

A segment of users or data points that receives no treatment or intervention, serving as a baseline for comparison in experiments.

Lift = (Test − Control) / Control

Metrics

False Positive

An error in data analysis where a test incorrectly indicates the presence of a condition or effect that is not actually present.

FPR = FP / (FP + TN)

Metrics

False Negative

An error where a test incorrectly indicates the absence of a condition when it is actually present.

FNR = FN / (FN + TP)
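
The false positive and false negative rate formulas above can be computed from the same four counts. The sketch below uses made-up numbers: imagine 100 past experiments whose true outcome is somehow known, a hypothetical setup chosen only to show the arithmetic. In testing terms, the false positive rate corresponds to the significance level and the false negative rate to one minus statistical power.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate) from outcome counts."""
    fpr = fp / (fp + tn)   # share of true "no effect" cases flagged as winners
    fnr = fn / (fn + tp)   # share of real effects the test missed
    return fpr, fnr


# Hypothetical counts: 20 tests had a real effect (16 detected, 4 missed);
# 80 had none (4 flagged as winners anyway, 76 correctly left alone).
fpr, fnr = error_rates(tp=16, fp=4, tn=76, fn=4)
print(f"FPR = {fpr:.2f}, FNR = {fnr:.2f}")
```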

General

Incrementality

Measurement methodology that isolates and quantifies the true causal impact of marketing activities by comparing against a baseline.

Incremental Lift % = ((Treatment Conversions − Control Conversions) / Control Conversions) × 100
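
One practical caveat when applying the formula above: raw conversion counts are only comparable when the treatment and control groups are the same size, and holdout groups are often much smaller. The sketch below is one way to handle that, assuming you compare conversion rates instead of counts. The function names and figures are hypothetical.

```python
def incremental_lift_pct(treat_conv, treat_users, ctrl_conv, ctrl_users):
    """Incremental lift in percent, computed on conversion rates."""
    treat_rate = treat_conv / treat_users
    ctrl_rate = ctrl_conv / ctrl_users
    return (treat_rate - ctrl_rate) / ctrl_rate * 100


def incremental_conversions(treat_conv, treat_users, ctrl_conv, ctrl_users):
    """Conversions beyond what the control rate predicts for the treated group."""
    expected_without_marketing = (ctrl_conv / ctrl_users) * treat_users
    return treat_conv - expected_without_marketing


# Hypothetical 90/10 holdout: 2,250 conversions from 90,000 exposed users,
# 200 conversions from 10,000 held-out users.
lift = incremental_lift_pct(2_250, 90_000, 200, 10_000)
extra = incremental_conversions(2_250, 90_000, 200, 10_000)
print(f"lift = {lift:.1f}%, incremental conversions = {extra:.0f}")
```

With these made-up numbers the rate-based lift is 25%, while plugging the raw counts into the formula would report 1,025% purely because the exposed group is nine times larger.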

Tools & Calculators

Interactive calculators, analyzers, and generators to apply these concepts.

Templates

Downloadable templates that operationalize this topic in your workflow.

Guides

Educational guides and platform specifications.

Articles & Benchmarks

Deep dives, benchmark reports, and strategic analysis from the AdSights blog.

Overview

Most "winning" tests are not. The fastest way to ship false winners is to read results before a test has gathered enough data, to call any difference a result, or to confuse a metric that moved in-platform with revenue that actually grew. Trustworthy experimentation is the discipline of separating signal from noise — and it rests on three dials you set before the test starts, not after.

The first dial is statistical significance, usually set at 95% confidence (p < 0.05). It answers a narrow question: if there were truly no difference between A and B, how often would random chance alone produce a gap at least this large? It does not tell you the variant is better, by how much, or whether the difference matters to the business. The second dial is statistical power, conventionally 80%: the chance your test detects a real effect of the size you planned for. Underpowered tests are the silent killer because they quietly miss real wins and leave teams concluding "no difference" when there was one.
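
To make the significance question concrete, here is a minimal sketch of a two-sided, pooled two-proportion z-test, one common way to get a p-value for conversion-rate A/B tests. It is an illustration under a normal approximation, not the method of any specific calculator on this page, and the function name and counts are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate: the best single estimate if A and B truly do not differ.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    # Probability of a gap at least this large, in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical test: 5.0% vs 5.6% conversion with 10,000 visitors per variant.
p = two_proportion_p_value(conv_a=500, n_a=10_000, conv_b=560, n_b=10_000)
print(f"p = {p:.3f}, significant at 0.05: {p < 0.05}")
```

In this made-up example a 12% relative lift still comes out just above the 0.05 threshold, which is what an underpowered test looks like in practice.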

The third dial is sample size, and it is driven by your baseline conversion rate and the minimum detectable effect (MDE) you care about. Lower baselines and smaller effects both demand dramatically more traffic: required sample size grows roughly with the inverse square of the effect, so halving the MDE roughly quadruples the visitors you need. As concrete anchors at 95% confidence (two-sided) and 80% power, a 5% baseline rate needs roughly 31,000 visitors per variant to detect a +10% relative lift, or about 8,100 to detect a +20% lift; a 1% baseline needs roughly 163,000 per variant for a +10% lift. Fix the sample size in advance, then resist peeking: checking repeatedly and stopping the moment you see significance inflates the false-positive rate far above the 5% you think you set.
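
The anchor figures above can be approximated with a standard normal-approximation formula for comparing two proportions. The sketch below assumes a two-sided test and equal traffic per variant; the function name is hypothetical, and other calculators use slightly different approximations, so expect results within about one percent of the quoted figures, not identical.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-sided test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)    # rate if the lift is real
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # significance dial
    z_beta = NormalDist().inv_cdf(power)            # power dial
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# The three scenarios quoted in the text: (baseline rate, relative lift).
for baseline, lift in [(0.05, 0.10), (0.05, 0.20), (0.01, 0.10)]:
    n = visitors_per_variant(baseline, lift)
    print(f"baseline {baseline:.0%}, lift +{lift:.0%}: {n:,} per variant")
```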

Finally, a significant in-platform lift is not the same as incremental revenue. Platform attribution credits conversions that often would have happened anyway; the only way to know what your marketing truly caused is a holdout or geo experiment that compares an exposed group to an unexposed control. Use the calculators below to plan sample size and read significance, the glossary to ground the concepts, and incrementality testing to confirm that a statistically real lift is also a real business one.

Related Topics

Frequently asked questions