How to A/B Test App Store and Play Store Screenshots
A complete guide to running screenshot A/B tests on the App Store (Product Page Optimization) and Google Play (Store Listing Experiments) — what to test, how long to run, and how to read the results.
A/B testing screenshots is the single highest-ROI optimization you can run on a published app. Here is how to do it properly.
Both Apple App Store and Google Play have built-in A/B testing for store listings. Apple's feature is called Product Page Optimization (PPO). Google's is called Store Listing Experiments. Both let you serve different versions of your screenshots to different users and measure which converts better.
If you have a published app and you have never A/B tested your screenshots, you almost certainly have install conversion left on the table. This guide walks through the full process.
What you can test on each platform
Apple App Store — Product Page Optimization (PPO)
You can run up to three treatment variants against your original at the same time. Each variant can change:
- Screenshots (full set)
- App icon
- App preview videos
Apple splits traffic between the original and the variants and reports the conversion rate for each. Tests run in App Store Connect.
Google Play — Store Listing Experiments
You can run an experiment with up to three variants against your default listing. Variants can change:
- Screenshots
- Short description
- Long description
- App icon
- Feature graphic
- Promo video
Tests run in Play Console under Store Listing Experiments.
What to test, in priority order
Don't test everything at once. The biggest wins come from testing variables in this order:
1. Screenshot 1 headline. The single highest-leverage variable on the entire listing. Test outcome-focused vs feature-focused, or short vs medium length.
2. Screenshot 1 visual. Hero image style and color treatment. Bold solid background vs subtle gradient. Light vs dark.
3. Order of screenshots. Move your strongest screenshot to position 1. Move your social-proof screenshot up.
4. Device frame on / off. Sometimes the bare screen out-converts the framed phone, sometimes the opposite.
5. App icon. Especially relevant for category-saturated apps where icon recognition matters.
6. Localization variant. Test a localized version against an English-only version in non-English markets.
How to set up a clean test
The single biggest mistake in screenshot A/B testing is changing too much at once. If your treatment changes both the headline and the visual, you can't tell which one moved the needle.
Rules for a clean test:
- Change one variable at a time. Different headline. Same visuals. Or same headline, different visual.
- Run for at least 7 days. Below 7 days, day-of-week effects can mislead you.
- Wait for statistical significance. Both Apple and Google report confidence levels. Don't call a winner before the platform says it's significant.
- Test in your largest market first. You need volume for significance. Test where you already have install traffic.
How long do tests need to run?
It depends on your impression volume:
- High traffic (10k+ store impressions/day): 7–14 days is usually enough.
- Medium traffic (1k–10k/day): 14–30 days.
- Low traffic (under 1k/day): 30+ days, and consider testing only large changes (a different style entirely), since small variations won't move conversion enough to detect.
If you don't have enough traffic for a credible test, focus first on growing impression volume (paid acquisition, content, ASO keyword optimization) and come back to A/B testing when you have signal.
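The duration guidance above falls out of standard sample-size math for comparing two conversion rates. Both stores do this calculation for you, but if you want to sanity-check how long a test will take before you launch it, here is a rough sketch using the classic two-proportion z-test formula (the baseline conversion rate, target lift, and traffic numbers below are illustrative assumptions, not data from any real listing):

```python
import math
from statistics import NormalDist

def days_needed(baseline_cr, relative_lift, daily_impressions_per_arm,
                alpha=0.05, power=0.8):
    """Rough number of days each variant needs to run to detect a given
    relative lift in conversion rate (two-sided two-proportion z-test)."""
    p1 = baseline_cr
    p2 = baseline_cr * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.96 for 95%
    z_beta = NormalDist().inv_cdf(power)           # e.g. 0.84 for 80% power
    p_bar = (p1 + p2) / 2
    # standard sample-size formula for comparing two proportions
    n = ((z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
          + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
         / (p2 - p1) ** 2)
    return math.ceil(n / daily_impressions_per_arm)

# Hypothetical: 3% baseline CR, hoping to detect a 10% relative lift,
# 5,000 impressions/day split 50/50 across two arms (2,500 each).
print(days_needed(0.03, 0.10, 2500))
```

Notice how sensitive the answer is to the size of the lift you're hunting for: halving the detectable lift roughly quadruples the required sample, which is why low-traffic listings should only test big, bold changes.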
Hypotheses worth testing
If you have never A/B tested screenshots, here are five hypotheses that frequently win on real listings:
- Outcome headlines beat feature headlines. Test "Know where your money went" against "Track your spending".
- Bolder colors beat subtle ones in search results. The eye picks out high-contrast thumbnails first.
- Showing real user data beats fake placeholder data. "42 habits tracked" is more credible than "__ habits tracked".
- Putting your strongest feature on screenshot 1 beats a generic hero. Lead with what only you do.
- Localized screenshots beat English in every non-English market. Test it. The lift is usually significant.
Reading the results
Both platforms report:
- Conversion rate per variant (impressions → installs)
- Lift over baseline as a percentage
- Confidence interval — typically reported at 90% or 95%
What "winning" means: a variant that beats baseline by more than the margin of error, with at least 90% confidence. If a variant is up 5% with a confidence interval of ±10%, it's noise — wait longer or run a different test.
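Both consoles report significance for you, but the "is it noise" judgment above is just a two-proportion z-test on impressions and installs. A minimal sketch, using made-up counts (nothing here reflects either platform's internal methodology):

```python
import math
from statistics import NormalDist

def lift_with_confidence(imp_a, inst_a, imp_b, inst_b, confidence=0.90):
    """Relative lift of variant B over baseline A, and whether the
    difference in conversion rate is significant at the given level."""
    cr_a, cr_b = inst_a / imp_a, inst_b / imp_b
    lift = (cr_b - cr_a) / cr_a
    # pooled standard error under the null hypothesis of no difference
    p = (inst_a + inst_b) / (imp_a + imp_b)
    se = math.sqrt(p * (1 - p) * (1 / imp_a + 1 / imp_b))
    z = (cr_b - cr_a) / se
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return lift, abs(z) >= z_crit

# Hypothetical counts: baseline at 3.0% CR, variant at 3.4% CR.
lift, significant = lift_with_confidence(40_000, 1_200, 40_000, 1_360)
print(f"{lift:+.1%} lift, significant at 90%: {significant}")
```

The same +13% lift that clears significance at 40,000 impressions per arm would be pure noise at 4,000, which is exactly why calling a winner early is the most common way to get fooled.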
What to do after a test ends
- If a variant won: Promote it to the live listing. Then design the next test based on what you learned (e.g. if outcome headlines won, test which type of outcome).
- If nothing won: Don't promote anything. Design a bigger test — change a more meaningful variable.
- If the original won: That's still useful information. Your current listing is solid for that variable. Test a different variable next.
Tooling tip: how to produce variants quickly
The reason most teams don't A/B test screenshots is that producing each variant takes too long. If every variant requires a designer working in Figma plus a separate export pipeline, the test never ships.
Use a screenshot tool that lets you duplicate a project, change one variable, and export the variant in store-ready dimensions in minutes. Launch Shots is built for this — duplicate a project, swap the headline or visual, export in App Store and Play Store dimensions, upload to App Store Connect or Google Play Console for the test. Each variant takes about 10 minutes once your base set is built.
A simple A/B testing cadence
Set a quarterly cadence:
- Month 1: Test screenshot 1 headline.
- Month 2: Test screenshot 1 visual.
- Month 3: Test screenshot order.
One year of disciplined testing usually produces a 30–60% lift in install conversion on a previously un-optimized listing. There is no other single change you can make to your app that has the same return.
