
Why does Winsorization matter in A/B testing?

September 18, 2024
Reading time: 5 mins
Nicholas Bevilacqua
Nicholas Bevilacqua is a Product Marketing Manager at Kameleoon, where he passionately communicates about Kameleoon's innovative features.

In A/B testing, we compare two or more variants to see which performs better. The data is usually reliable, but in industries like gaming or online betting, a few users making very large transactions can skew the results and distort the true performance of your variants.

That’s where Winsorization comes in. By reducing the impact of extreme values, it helps provide more accurate and actionable insights, especially with continuous metrics like revenue.

But what exactly is Winsorization? And how can you get the most out of it for your experimentation program?

In this article, we define Winsorization, explain where and when to use it for best results, and highlight how Kameleoon supports this testing approach.

What is Winsorization?

Winsorization is a statistical technique that limits extreme values in your data: values beyond chosen percentiles are replaced with the values at those percentiles, which reduces the impact of outliers. Outliers can skew A/B test results, and Winsorizing your data makes your results more robust and reliable.
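To make the definition concrete, here is a minimal sketch in plain Python. It is an illustration, not Kameleoon's implementation: the helper names `percentile` and `winsorize`, the linear-interpolation percentile rule, and the sample revenue figures are all assumptions chosen for the example.

```python
def percentile(values, pct):
    """Return the pct-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * pct / 100
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    fraction = rank - low
    return ordered[low] + (ordered[high] - ordered[low]) * fraction


def winsorize(values, lower_pct, upper_pct):
    """Clamp every value into the range [lower percentile, upper percentile]."""
    lower = percentile(values, lower_pct)
    upper = percentile(values, upper_pct)
    return [min(max(v, lower), upper) for v in values]


# Hypothetical revenue per user, with one very large transaction.
revenue = [5, 20, 22, 25, 27, 30, 32, 35, 40, 900]
capped = winsorize(revenue, lower_pct=10, upper_pct=90)
print(capped)
```

Note that no observation is dropped: the list keeps its length, and only the values beyond the chosen percentiles are pulled in to the limits.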

When to use Winsorization

Winsorization is helpful when your data has extreme values that aren't errors but still differ significantly from other observations.

This technique is especially useful in industries with high transaction variability, like gaming or online betting. It’s a simple way to manage outliers while maintaining data integrity.

However, it’s less relevant for binary outcomes like conversion rates, where each observation is either 0 or 1 and extreme values cannot occur.

Winsorization improves the accuracy of A/B tests

Outliers, which often come from data entry errors, unusual user behavior, or rare events, can skew A/B test results.

This is particularly relevant in industries with high transaction variability. Winsorization tackles this by capping extreme values at a set percentile, making sure the data better reflects the typical user experience.

When your data more accurately reflects user behavior, the conclusions from A/B tests become more reliable. Applied consistently across all variants, Winsorization cuts down on noise, stabilizes metrics, and improves comparability.

However, for simpler cases like binary metrics (e.g., converted or not converted), outliers are less of an issue, so Winsorization might not be needed.

Here are some ways Winsorization leads to more accurate A/B testing results:

Enhances robustness of test results

Winsorization strengthens your test results by preventing a few extreme values from skewing the data. This added robustness is crucial for making informed decisions that aren’t overly influenced by anomalies.

Simplifies A/B testing environments for all types of users

Winsorization is simple and easy to implement, making it accessible for users of all statistical skill levels. It’s an efficient way to manage outliers without needing complex algorithms or deep statistical knowledge.

Risks and best practices with Winsorization

Risks of Winsorization

  • Loss of data integrity. Winsorizing too aggressively can alter your data too much, potentially hiding important variations and patterns.
  • Oversimplification. By modifying outliers, you may oversimplify your data, which can result in an incomplete understanding of your dataset.
  • Bias introduction. Inappropriate Winsorization thresholds can introduce bias, skewing your results and leading to incorrect conclusions.

Best practices for Winsorization

  • Understand your data. Thoroughly understand the nature and distribution of your data before applying Winsorization.
  • Set appropriate thresholds. Set your Winsorization thresholds using industry standards or data-specific insights. For example, symmetrical thresholds treat both tails the same way, which helps limit shifts in the central tendency and maintain the balance and integrity of your data.
  • Evaluate impact. After applying Winsorization, evaluate its impact on your data and test your results.
  • Document your process. Document your Winsorization process thoroughly, noting the reasons for your chosen thresholds and how they impact your data.
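The "evaluate impact" step above can be sketched as a small before-and-after comparison. This is one possible check, not a prescribed procedure: the function name `impact_report`, the choice of statistics, and the sample values are assumptions for illustration.

```python
from statistics import mean, stdev


def impact_report(raw, winsorized):
    """Summarize how much Winsorization changed a metric."""
    changed = sum(1 for r, w in zip(raw, winsorized) if r != w)
    return {
        "mean_before": mean(raw),
        "mean_after": mean(winsorized),
        "stdev_before": stdev(raw),
        "stdev_after": stdev(winsorized),
        "share_changed": changed / len(raw),
    }


# Hypothetical metric values before and after capping at 18.
raw = [10, 12, 14, 15, 16, 18, 100]
winsorized = [10, 12, 14, 15, 16, 18, 18]

report = impact_report(raw, winsorized)
for key, value in report.items():
    print(f"{key}: {value:.2f}")
```

A large share of changed values, or a mean that moves far more than expected, is a hint that the thresholds may be too aggressive. Keeping the report alongside the chosen thresholds also covers the documentation step.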

How to apply Winsorization in Kameleoon

Identify outliers in your data

First, your team identifies outliers in your data through your Custom Goal. Outliers are typically the values that fall outside a specific range, traditionally below the 0.1th percentile or above the 99.9th percentile of your data.
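Outside the platform, the same idea can be explored on exported data. The sketch below flags values beyond the 0.1th and 99.9th percentiles using Python's standard `statistics.quantiles`. The function name `find_outliers` and the sample data are hypothetical, and this does not describe how Kameleoon computes its own percentiles.

```python
from statistics import quantiles


def find_outliers(values):
    """Return the values below the 0.1th or above the 99.9th percentile."""
    # n=1000 yields 999 cut points: the first is the 0.1th percentile,
    # the last is the 99.9th percentile.
    cuts = quantiles(values, n=1000, method="inclusive")
    low, high = cuts[0], cuts[-1]
    return [v for v in values if v < low or v > high]


# Hypothetical data: 1,999 ordinary values plus one huge transaction.
values = list(range(1, 2000)) + [1_000_000]
outliers = find_outliers(values)
print(outliers)
```

A percentile rule always flags a small share of values at each end, even when they are not unusual, so it is worth looking at the flagged values before deciding on limits.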

Set limits to account for outliers

Next, you set limits to account for the identified outliers. For example, with 95% Winsorization, any value below the 2.5th percentile is raised to the value at the 2.5th percentile, and any value above the 97.5th percentile is lowered to the value at the 97.5th percentile. You can find these bounds in the Advanced Settings of your Custom Goal.
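The relationship between a Winsorization level and its two percentile bounds is simple arithmetic. Here is a minimal sketch, assuming symmetrical bounds as in the 95% example; the function name `winsorization_bounds` is hypothetical.

```python
def winsorization_bounds(level_pct):
    """Split the share outside the level evenly between the two tails."""
    if not 0 < level_pct <= 100:
        raise ValueError("level_pct must be in (0, 100]")
    tail = (100 - level_pct) / 2
    return tail, 100 - tail


lower_pct, upper_pct = winsorization_bounds(95)
print(lower_pct, upper_pct)  # 2.5 97.5
```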

Read your results in the Results Pages

After applying Winsorization to your Goal, you can view the adjusted results on the various Results Pages. The Goal container will have a badge indicating that outliers are managed for this Goal. Hover over the badge to see the parameters you’ve set.

Example of the impact of Winsorization on A/B test results

Imagine an A/B test comparing two landing pages (A and B). Without Winsorization, a few high-value outliers—like purchases from a few big spenders—could make one page seem much more effective, even if most users don't behave that way.

Original Data (Metric: Revenue per User):

  • Page A: [10, 12, 14, 15, 16, 18, 100]
  • Page B: [11, 13, 15, 15, 17, 19, 110]

 

Average Revenue per User:

  • Page A: 26.43
  • Page B: 28.57

 

In this example, we apply Winsorization at the 95% level, meaning we cap the extreme 2.5% on both ends. This sets our limits as the 2.5th percentile and the 97.5th percentile of the data. Let’s assume these percentiles correspond to the following limits:

  • Lower limit (2.5th percentile): 10
  • Upper limit (97.5th percentile): 18 for Page A and 19 for Page B

 

Now, we adjust any value outside these limits to the nearest limit.

Winsorized Data:

  • Page A: [10, 12, 14, 15, 16, 18, 18] (the 100 is replaced by 18)
  • Page B: [11, 13, 15, 15, 17, 19, 19] (the 110 is replaced by 19)

 

Winsorized Average Revenue per User:

  • Page A: 14.71
  • Page B: 15.57
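The arithmetic in this example can be checked with a few lines of Python. The sketch applies the limits assumed above (10 to 18 for Page A, 10 to 19 for Page B) rather than computing percentiles from such a small sample; the helper name `cap` is hypothetical.

```python
from statistics import mean


def cap(values, lower, upper):
    """Move every value outside [lower, upper] to the nearest limit."""
    return [min(max(v, lower), upper) for v in values]


page_a = [10, 12, 14, 15, 16, 18, 100]
page_b = [11, 13, 15, 15, 17, 19, 110]

# Limits taken from the example, not computed from the data.
winsorized_a = cap(page_a, lower=10, upper=18)
winsorized_b = cap(page_b, lower=10, upper=19)

for name, raw, capped in [("A", page_a, winsorized_a), ("B", page_b, winsorized_b)]:
    print(f"Page {name}: raw mean {mean(raw):.2f}, Winsorized mean {mean(capped):.2f}")
```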

 

In this case, without Winsorization, the raw averages are driven largely by one big spender on each page (100 on Page A, 110 on Page B), so the gap between the pages says little about typical users. After Winsorization, the comparison rests on typical user revenue, and Page B comes out ahead, 15.57 versus 14.71.

Technical considerations of Winsorization with Kameleoon

When you apply Winsorization to your Custom Goal, Kameleoon calculates and stores the percentile values you set and uses them to adjust your data on the Results Pages. These values are updated daily.

However, if you change the percentiles in the Edit panel, the new values are calculated immediately—no need to wait. The raw data remains unchanged.
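One way to picture this behavior is to treat the limits as stored settings that are applied when results are read, while the raw values are never modified. The sketch below is purely conceptual and makes no claim about Kameleoon's internals: `StoredLimits`, `adjusted_view`, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredLimits:
    """Limits kept alongside a goal (hypothetical structure)."""
    lower: float
    upper: float


def adjusted_view(raw_values, limits):
    """Build the adjusted values for display without touching the raw data."""
    return [min(max(v, limits.lower), limits.upper) for v in raw_values]


# A tuple, so the raw data cannot be modified in place.
raw_revenue = (10, 12, 14, 15, 16, 18, 100)

# The same raw data seen through two different sets of limits.
view_1 = adjusted_view(raw_revenue, StoredLimits(lower=10, upper=18))
view_2 = adjusted_view(raw_revenue, StoredLimits(lower=12, upper=16))

print(view_1)
print(view_2)
print(raw_revenue)
```

Because the adjustment is derived from the raw values each time, changing the limits only changes the view, and the original data stays available for other analyses.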

The power of Winsorization in Kameleoon A/B testing

Winsorization is a powerful tool in our A/B testing platform for managing outliers. By limiting the impact of extreme values, it keeps your test results accurate, robust, and actionable.

Understanding the risks and following best practices lets you use Winsorization effectively without compromising your data's integrity.

Curious about how Kameleoon handles continuous metrics? Check out our blog to see how you can use both frequentist and Bayesian methods for more robust data analysis.
