-
What
- the practice of showing two variants of the same web page to different segments of visitors at the same time and comparing which variant drives more conversions
-
A/B testing is a general methodology used online when testing product changes and new features
- A/B testing works best when testing incremental changes
- A/B testing doesn’t work well when testing major changes, like new products, new branding or completely new user experiences
- the basis for Data Driven Development
-
Two user experiences, with users randomly assigned to one of them
- random assignment averages out all other factors
- lets you compare the two user experiences on a single metric
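
A minimal sketch of random assignment, assuming users are identified by a string ID. The function name `assign_variant`, the experiment label and the 50/50 split are illustrative assumptions, not something the notes prescribe. Hashing the ID keeps the assignment stable across visits while behaving like a random split.

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "demo-experiment") -> str:
    """Deterministically map a user to 'A' or 'B' with roughly equal odds."""
    # Salting with the experiment name (hypothetical) keeps splits
    # independent between different experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99
    return "A" if bucket < 50 else "B"

# Rough balance check on synthetic user IDs.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"user-{i}")] += 1
print(counts)
```

The same user always lands in the same variant, so the experience does not flip between visits.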
-
What to test
-
Attraction Marketing
- channels
-
Product solutions
-
think about the concept of product value
- think about ROI
-
technical solutions
- find strange scenarios
- understand the value of refactoring
- business model
-
Why
- to expand your business by acquiring new customers and build relationships by catering to existing ones
- Solve Visitor Pain Points
- Get Better ROI from Existing Traffic
- Reduce Bounce Rate
- Make Low-risk Modifications
- Achieve Statistically Significant Improvements
- Profitably Redesign your Website
- Data Driven Development (DDD)
-
Mistakes
-
Not Planning your Optimization Roadmap
- Invalid hypothesis
- Testing too Many Elements Together
- Ignoring Statistical Significance
- Using Unbalanced Traffic
- Testing for Incorrect Duration
- Failing to Follow an Iterative Process
- Using the Wrong Tools
-
Testing is carried out for a company that has not reached the required level of DDD
- tests are ineffective
-
"+" / "-"
-
And what else
-
learn more about our users
- to make strategic decisions
- guards against mistaking chance fluctuations for product improvements
-
more flexible project infrastructure for release management: quick release of changes
- Teams know the main metrics
- The team stays in touch with changes and is included in the process
- formulation of hypotheses when setting tasks for the team, understanding the metrics, linking to the company's goals: understanding the relevance of the business goal
-
and what is bad
- DDD degrades the design
- tests are hard to run correctly, and it is easy to draw the wrong conclusions
-
cool idea, but does not affect the business
- very disappointing
- long, expensive, tiring
-
local optimum trap
- a big step is very expensive
- focus on quick and short-term understandable goals, no focus on long-term
-
Maths
-
general population and sample
-
general population
- all objects that interest us
-
sample
- based on the results, we try to draw conclusions for the general population
-
statistical methods
-
sample studies
- help us make informed decisions based on probabilities
-
estimation accuracy
-
confidence interval
- a range that contains the true value with a stated probability (the confidence level)
- calculation
- https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval
- https://sample-size.net/confidence-interval-proportion/
- https://www.calculator.net/confidence-interval-calculator.html
- c ± 1.645 * sqrt(c * (1 - c) / N)
- c - conversion rate in the sample
- 1.645 - coefficient that depends on the confidence level (here 90%)
- N - number of observations
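
The formula above can be sketched in Python as follows. This is the normal-approximation interval for a proportion. The function name `conversion_ci` and the example numbers (120 conversions out of 1000 visitors) are made up for illustration.

```python
import math

def conversion_ci(conversions: int, n: int, z: float = 1.645) -> tuple[float, float]:
    """Normal-approximation confidence interval for a conversion rate.

    z = 1.645 corresponds to the 90% confidence level used in the notes.
    """
    c = conversions / n                      # conversion rate in the sample
    margin = z * math.sqrt(c * (1 - c) / n)  # half-width of the interval
    return c - margin, c + margin

# Hypothetical sample: 120 conversions out of 1000 visitors.
low, high = conversion_ci(120, 1000)
print(f"90% CI: {low:.4f} .. {high:.4f}")
```

The approximation gets unreliable for very small samples or conversion rates close to 0 or 1. The Wikipedia page linked above lists alternatives for those cases.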
-
Result
-
Stat test
-
Task
-
assess whether the difference in results is due to the product change rather than chance
- mistakes
- true negative
- no difference
- false negative
- Type II error: the test showed no difference, but there actually is one
- Experiment power (sensitivity)
- depends on:
- selected confidence level
- sample size
- effect size
- equals 1 minus the probability of making a Type II error
- the probability of overlooking the effect depends on the size of the actual effect
- false positive
- Type I error: the test showed a difference, but there actually is none
- significance level (1 - confidence level)
- the probability of making such a mistake when there is really no difference
- p-value
- an estimate of the probability of obtaining the observed value by chance
- the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct
- true positive
- there is a difference
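
A minimal sketch of how a p-value for two conversion rates could be computed, assuming a pooled two-proportion z-test. The notes do not name a specific test here, so that choice is an assumption, as are the function name and the example counts.

```python
import math

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int,
                           one_sided: bool = True) -> float:
    """p-value for H0 'no difference' between conversion rates of A and B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Upper tail of the standard normal distribution, P(Z >= z).
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2))
    if one_sided:
        return upper_tail  # alternative: B converts better than A
    return 2 * min(upper_tail, 1 - upper_tail)

# Hypothetical data: A converts 100/1000, B converts 150/1000.
p = two_proportion_p_value(100, 1000, 150, 1000)
print(f"one-sided p-value: {p:.5f}")
```

With a 90% confidence level, a p-value below 0.10 would lead to rejecting the hypothesis of no difference.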
-
tools
- https://abtestguide.com/calc/
- https://abtestguide.com/abtestsize/
-
structure
-
Let's use H0
- assume there is no difference, then check whether the data contradict this assumption
- Null hypothesis
- no difference
- Alternative hypothesis
- samples A and B are taken from populations with different distributions
-
options
- G-test
- Chi-squared (χ²) test
- Student's t-test
- for binary values
- for continuous variables
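
As an illustration of one of the options above, here is a sketch of a chi-squared test on a 2x2 table of converted and non-converted users. It uses no continuity correction, and it relies on the fact that with 1 degree of freedom the chi-squared tail probability can be written with `math.erfc`. The function name and the counts are hypothetical.

```python
import math

def chi_squared_2x2(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Pearson chi-squared statistic and p-value for a 2x2 conversion table."""
    observed = [[conv_a, n_a - conv_a], [conv_b, n_b - conv_b]]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [conv_a + conv_b, total - conv_a - conv_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if conversion did not depend on the variant.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed[i][j] - expected) ** 2 / expected
    # Survival function of chi-squared with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p_value = chi_squared_2x2(100, 1000, 150, 1000)
print(f"chi2 = {stat:.3f}, p = {p_value:.5f}")
```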
-
triangle
- confidence level and power level
- sample size
- minimum detectable effect
-
find a balance
- for a more sensitive experiment we increase the sample size
- if you reduce the number of observations, the minimum detectable effect becomes larger
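
The trade-off can be sketched with a commonly used sample-size formula for comparing two proportions. The formula choice, the function name `sample_size_per_variant` and the example baseline of 10% are assumptions for illustration. The online calculators linked in these notes may use slightly different formulas.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline: float, mde: float,
                            confidence: float = 0.90, power: float = 0.80,
                            one_sided: bool = True) -> int:
    """Approximate users needed per variant to detect an absolute lift `mde`."""
    p1, p2 = baseline, baseline + mde
    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha if one_sided else 1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

# Smaller detectable effect -> larger sample, and vice versa.
for mde in (0.04, 0.02, 0.01):
    print(mde, sample_size_per_variant(0.10, mde))
```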
-
Let's Test
-
Research
- How the website is currently performing
- use Heatmap tools
- quantitative and qualitative research
- Observe and Formulate Hypothesis
-
Define Typical Values
-
Confidence level
-
90%
- Type I error: the test reports an effect that does not actually exist
-
Power
-
80%
- Type II error
-
experiment duration
- round up to whole weeks
-
Hypothesis type
- One-sided
-
Create samples
-
Let's calculate the size of the required sample
- https://abtestguide.com/abtestsize/
- find out the duration of the experiment
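
A rough sketch of turning the required sample size into a duration, rounded up to whole weeks as suggested earlier. It assumes all daily visitors enter the experiment and traffic is steady. The function name and the traffic numbers are hypothetical.

```python
import math

def experiment_weeks(sample_per_variant: int, variants: int, daily_visitors: int) -> int:
    """Number of whole weeks needed to collect the required sample."""
    total_needed = sample_per_variant * variants
    days = math.ceil(total_needed / daily_visitors)
    return math.ceil(days / 7)  # round up to whole weeks

# Hypothetical: 5000 users per variant, 2 variants, 1000 visitors per day.
print(experiment_weeks(5000, 2, 1000))
```

Whole weeks help keep weekday and weekend behaviour equally represented in both samples.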
-
Run Test
-
Split URL Testing
-
How
- Split URL Testing is testing multiple versions of your webpage hosted on different URLs
- to compare two versions of a product
- to find out how changes in your product have affected its use: to compare the key product metrics for each version
-
Strategy
- Setting up pages for the Split URL test
- Adding conversion goals and estimating test duration
- Finalizing the test
- Previewing and starting the test
-
Multivariate Testing (MVT)
- changes are made to multiple sections of a webpage, and variations are created for all the possible combinations
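
To illustrate how quickly the number of combinations grows, here is a small sketch that enumerates every variation for a made-up page with three sections. The section names and options are invented for the example.

```python
from itertools import product

# Hypothetical page sections and the options tested for each.
sections = {
    "headline": ["Original headline", "New headline"],
    "button_color": ["green", "orange"],
    "hero_image": ["photo", "illustration", "none"],
}

# One variation per combination of options: 2 * 2 * 3 = 12 in total.
variations = [dict(zip(sections, combo)) for combo in product(*sections.values())]

print(len(variations))
print(variations[0])
```

Each variation needs enough traffic of its own, so MVT usually requires far more visitors than a simple A/B test.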
-
Multipage Testing
- to test changes to particular elements across multiple pages
-
Let's test our results
- https://abtestguide.com/calc/
-
estimate p-value
- https://abtestguide.com/calc/
-
use tools
-
calcs
-
ABTestGuide
- https://abtestguide.com/calc/
-
GTM testing
- Google Tag Manager
- https://abtestguide.com/gtmtesting/
-
Bayesian A/B-test Calculator
- https://abtestguide.com/bayesian/
-
Optimizely
- https://www.optimizely.com/sample-size-calculator/
-
ProdCalc
- https://prodcalc.app/?fbclid=IwAR23UeOp1zau_itFWUGxehyG_saTaLTykTppnDsaYgwTXMwp6o33LGAqmiw
-
A/B Split & Multivariate Test Duration Calculator
- https://vwo.com/tools/ab-test-duration-calculator/
-
CLT for means
- https://gallery.shinyapps.io/CLT_mean/
-
Normal Table - z Table - Standard Normal Table - Normal Distribution Table
- http://www.normaltable.com/ztable-righttailed.html
-
Distribution Calculator
- https://gallery.shinyapps.io/dist_calc/
-
Sample Size Calculator (Evan’s Awesome A/B Tools)
- https://www.evanmiller.org/ab-testing/sample-size.html
-
CI-process
- Measure
-
Prioritize
-
CIE Prioritization Framework
- Confidence
- Importance
- Ease
- A/B test
- Repeat
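
A possible way to apply the CIE framework when prioritizing hypotheses. The notes only name the three criteria, so the 1 to 5 scale, the plain average and the example ideas below are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Idea:
    name: str
    confidence: int  # how sure we are it will work (assumed 1..5 scale)
    importance: int  # how much it matters for the business (assumed 1..5)
    ease: int        # how easy it is to implement (assumed 1..5)

    @property
    def score(self) -> float:
        # Assumption: equal weights, simple average of the three criteria.
        return (self.confidence + self.importance + self.ease) / 3

# Made-up backlog of test ideas.
backlog = [
    Idea("Shorter signup form", confidence=4, importance=5, ease=3),
    Idea("New button color", confidence=2, importance=2, ease=5),
    Idea("Rewrite pricing page", confidence=3, importance=5, ease=1),
]

for idea in sorted(backlog, key=lambda i: i.score, reverse=True):
    print(f"{idea.score:.2f}  {idea.name}")
```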
-
We work in a changing and unpredictable environment
-
our changes in product
- better
- worse
-
hypothesis testing (typical)
-
User's feedback
- feedback is not always truthful and relevant
-
Sampling bias
- feature usage statistics
- biases
- confusing correlation with causation
- survivorship bias
-
Comparisons of events in time
- the product is influenced by many factors
- changes in competitors
- technical features, new technologies
- the product has become faster / slower
- seasonal demand
- pure chance
-
Evolutionary distortion
- we see patterns where there are none
- we notice evidence that confirms our beliefs and overlook the rest
-
Sources
- https://vc.ru/flood/6371-ab-errors
- https://vwo.com/ab-testing/
- https://blog.hubspot.com/marketing/how-to-do-a-b-testing
- https://www.crazyegg.com/blog/ab-testing/
- https://medium.com/@robbiegeoghegan/implementing-a-b-tests-in-python-514e9eb5b3a1
- https://classroom.udacity.com/courses/ud257/lessons/4018018619/concepts/40043986940923
-
my sketch
- https://twitter.com/ManukhinaDarya/status/1295284820365520896?s=20