ab-test

A/B testing with non-Gaussian distributions

anishtain4

May 6, 2022 06:01

I have two sets of samples (A, B) with a relatively high number (~10,000) and I want to see if a factor has affected sample B or not. Naturally, I should use A/B testing. The problem is, the distributions are not normal and I'm interested in the maximum change, not the mean values! So if all you know is how CLT is gonna make everything Gaussian, this is a good point to stop and move on to the next question. …

Topic: ab-test statistics

Category: Data Science

A/B Testing (Binomial Distribution vs Random Distribution)

DD.

April 28, 2022 17:04

When running an A/B test on the number of clicks from users viewing two variants of an ad (each view is an impression), a binomial distribution can be assumed, where each variant has a constant click-through rate (CTR). Example: ad one has 1000 impressions and 20 clicks, so its CTR is 2%; ad two has 900 impressions and 30 clicks, so its CTR is 3.3%. Test whether the CTR differs between ads one and two. …

Topic: distribution descriptive-statistics ab-test statistics

Category: Data Science
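
For the CTR example in the question above, one common way to compare two binomial rates is a pooled two-proportion z-test. The following is a minimal sketch using only the standard library; the function name `two_proportion_z_test` is made up for illustration, and the normal approximation is an assumption that is usually reasonable at these counts but is not the only valid choice.

```python
import math


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided pooled z-test for a difference between two proportions."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    # Under the null hypothesis both ads share one CTR, estimated by pooling.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts taken from the question: 20/1000 clicks vs. 30/900 clicks.
z, p_value = two_proportion_z_test(20, 1000, 30, 900)
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

With small click counts, an exact test may be preferable to the normal approximation used here.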

Hypothesis test for classification model

william007

April 25, 2022 09:33

I have a model that outputs 0 or 1 for interest/not-interest in a job. I'm doing an A/B/C test comparing two models (treatment groups) and none (control group). ANOVA for hypothesis testing and t-test with Bonferroni correction for posthoc testing is my plan. But both tests assume normality. Can we have normality for 0 and 1? If so, how? If not, what's the best test (including posthoc)?

Topic: anova hypothesis-testing ab-test classification

Category: Data Science
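
For a 0/1 outcome across three groups, as in the question above, one widely used alternative to ANOVA is a Pearson chi-square test on the 3x2 table of counts. The sketch below is an illustration under assumptions, not a recommendation from the thread: the counts are hypothetical, the helper name `chi_square_3x2` is made up, and the closed-form p-value `exp(-x/2)` holds only because a 3x2 table has 2 degrees of freedom.

```python
import math


def chi_square_3x2(successes, totals):
    """Pearson chi-square test of homogeneity for three groups with a 0/1 outcome."""
    if len(successes) != 3 or len(totals) != 3:
        raise ValueError("this sketch handles exactly three groups")
    # Common success rate under the null hypothesis of no group difference.
    overall_rate = sum(successes) / sum(totals)
    stat = 0.0
    for s, n in zip(successes, totals):
        expected_pos = n * overall_rate
        expected_neg = n * (1 - overall_rate)
        stat += (s - expected_pos) ** 2 / expected_pos
        stat += ((n - s) - expected_neg) ** 2 / expected_neg
    # df = (3 - 1) * (2 - 1) = 2, and the chi-square survival function
    # with 2 degrees of freedom is exactly exp(-x / 2).
    p_value = math.exp(-stat / 2)
    return stat, p_value


# Hypothetical counts: control, model 1, model 2 (1000 users each).
stat, p_value = chi_square_3x2([100, 120, 130], [1000, 1000, 1000])
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
```

Pairwise follow-up comparisons would still need a multiple-comparison correction such as the Bonferroni correction mentioned in the question.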

A/B test results contradictory with offline machine learning model performance

CathyQian

April 21, 2022 23:00

This seems to be a common problem when bringing machine learning models to production. Let's say we have an optimized machine learning model that gives a decent performance metric on the unseen test dataset. We are quite satisfied with that and decide to bring the model online. Then we use an A/B test to compare our website's performance (e.g., revenue, customer engagement) with and without the new model. Somehow, our new model is not a clear winner or even a clear …

Topic: ab-test machine-learning

Category: Data Science

Can I use multi armed bandits to optimize how much both algorithms are weighted when creating a composite score?

madst

March 17, 2022 19:20

So, I'm aware that multi-armed bandits are great for evaluating multiple models and, from what I understand, they are mainly used to pick a specific model. I would still like to evaluate two models, but I want to do it differently. Take a look at this simple equation: W_A * RecoScore_A + W_B * RecoScore_B = CompScore. Rather than optimize for a specific model for a given user, I'd like to optimize for a given set of weights. I'm wondering …

Topic: ab-test experiments recommender-system

Category: Data Science
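
One way to read the question above is to treat each candidate weight pair (W_A, W_B) as a bandit arm. The sketch below assumes a small discrete grid of weights and an epsilon-greedy policy; the class name `EpsilonGreedyWeights`, the weight grid, and the reward signal are all hypothetical and not taken from the question.

```python
import random


def composite_score(w_a, w_b, score_a, score_b):
    """CompScore = W_A * RecoScore_A + W_B * RecoScore_B, as in the question."""
    return w_a * score_a + w_b * score_b


class EpsilonGreedyWeights:
    """Epsilon-greedy bandit whose arms are candidate (W_A, W_B) pairs."""

    def __init__(self, weight_pairs, epsilon=0.1, seed=0):
        self.weight_pairs = list(weight_pairs)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.pulls = [0] * len(self.weight_pairs)
        self.mean_reward = [0.0] * len(self.weight_pairs)

    def select(self):
        # Explore a random arm with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.weight_pairs))
        return max(range(len(self.weight_pairs)), key=self.mean_reward.__getitem__)

    def update(self, arm, reward):
        # Incremental update of the running mean reward for this arm.
        self.pulls[arm] += 1
        self.mean_reward[arm] += (reward - self.mean_reward[arm]) / self.pulls[arm]


# Hypothetical weight grid; the reward (e.g. a click) would come from the user.
bandit = EpsilonGreedyWeights([(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)])
arm = bandit.select()
w_a, w_b = bandit.weight_pairs[arm]
score = composite_score(w_a, w_b, score_a=0.8, score_b=0.4)
bandit.update(arm, reward=1.0)
```

A discrete grid keeps the sketch simple; continuous weights would call for a different optimizer.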

Significance testing - repeated observations over multiple days

user54052

March 11, 2022 20:00

I work in mobile gaming and want to analyze A/B test groups, but I believe I'm introducing errors in my calculations. The metric I'm looking at is: number of unique players who engaged in battle that day / number of unique players who were active that day. My data currently has one row per active player per day: date, player_id, group (A/B), and a 0/1 boolean for whether the player engaged in battle that day. I split the groups A/B and take …

Topic: ab-test

Category: Data Science
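
The daily metric described in the question above can be computed from rows of (date, player_id, group, engaged) roughly as follows. This is only a sketch of the metric itself, with made-up sample rows and a hypothetical helper name `daily_engagement_rate`; it does not settle how to test significance.

```python
from collections import defaultdict


def daily_engagement_rate(rows):
    """Return {(date, group): engaged unique players / active unique players}."""
    active = defaultdict(set)
    engaged = defaultdict(set)
    for date, player_id, group, did_engage in rows:
        key = (date, group)
        active[key].add(player_id)
        if did_engage:
            engaged[key].add(player_id)
    return {key: len(engaged[key]) / len(players) for key, players in active.items()}


# Made-up rows: one row per active player per day.
rows = [
    ("2022-03-01", "p1", "A", 1),
    ("2022-03-01", "p2", "A", 0),
    ("2022-03-01", "p3", "B", 1),
    ("2022-03-02", "p1", "A", 1),
    ("2022-03-02", "p3", "B", 0),
]
rates = daily_engagement_rate(rows)
for key in sorted(rates):
    print(key, rates[key])
```

Note that the same player_id can appear on several days, so the daily rows are not independent observations of different players.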

A/B test on model - split on results

Matteo Felici

February 14, 2022 15:08

I developed a predictive model that assigns the best product (P1, P2, P3) to each customer. I wanted to compare the conversion rate of this model against the as-is deterministic assignment, so I ran an A/B test: I chose the product among P1, P2, P3 using the model for 50% of my users and using the deterministic rules for the other 50%, and then I compared the conversion rates. My question is: is it correct to split the analysis on …

Topic: ab-test predictive-modeling

Category: Data Science

What is the minimum size of the test set?

Bob

December 19, 2021 16:20

The mean of a population of binary values can be sampled with about 1000 samples at 95% confidence, and 3000 samples at 99% confidence. Assuming a binary classification problem, why is the 80/20% rule always used, and not the fact that with a few thousand samples the mean accuracy can be estimated with > 95% confidence?

Topic: ab-test cross-validation statistics

Category: Data Science
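
The sample counts quoted in the question above depend on a margin of error that the excerpt does not state. A minimal sketch of the usual normal-approximation formula n = z^2 * p * (1 - p) / e^2 follows; the 3% margin and the worst case p = 0.5 are assumptions chosen for illustration, and `sample_size_for_proportion` is a made-up name.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence, p=0.5):
    """Samples needed to estimate a proportion within +/- margin."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Assumed 3% margin of error at two confidence levels.
n_95 = sample_size_for_proportion(margin=0.03, confidence=0.95)
n_99 = sample_size_for_proportion(margin=0.03, confidence=0.99)
print(n_95, n_99)
```

Because the required n scales with 1 / margin^2, halving the margin roughly quadruples the sample size.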

Causal Inference where the treatment assignment is randomized

manish Prasad

December 8, 2021 18:16

I have mostly worked with observational data where the treatment assignment was not randomized. In the past, I have used PSM and IPTW to balance the groups and then calculate the ATE. My problem is that I am now working on a problem where the treatment assignment is randomized, meaning there won't be a confounding effect, but the treatment and control groups have different sizes: there's a bucket imbalance. Should I just analyze the data as it is and run statistical significance and statistical power …

Topic: causalimpact ab-test python statistics

Category: Data Science

AB testing split algorithm

AO1992

October 20, 2021 12:29

I want to understand what is the most effective algorithm for splitting. I have ids of users and I want to split them into 2 groups. Now I have 2 variants: Modulo approach - let's say we will place all even ids into one group, odd numbers into another. Pros - for any sequence we will have a uniform distribution of users. So for any day or hour, users that registered during that time will be equally divided between 2 …

Topic: hashing-trick ab-test

Category: Data Science
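
The second variant in the question above is cut off in the excerpt. Going by the hashing-trick tag, a hash-based split is one plausible alternative to the modulo approach; the sketch below assumes SHA-256 over a salted user id, and the function name `assign_group` and the experiment salt are hypothetical.

```python
import hashlib


def assign_group(user_id, experiment="exp-1", groups=("A", "B")):
    """Deterministically map a user id to a group via a salted hash."""
    # Salting with the experiment name decorrelates splits across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(groups)
    return groups[bucket]


# The same id always lands in the same group for a given experiment.
counts = {"A": 0, "B": 0}
for uid in range(10_000):
    counts[assign_group(uid)] += 1
print(counts)
```

Unlike a plain even/odd split, changing the salt reshuffles users, so the same users are not always grouped together across experiments.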

Multivariate testing

David Masip

October 19, 2021 07:16

I'm going to run a test with 4 different variants (3 variants and a control group), and we want to find the variant with the highest conversion. Are there any resources/methods in R/python to: Perform a test to tell if a variant converts significantly better than the others? Calculate sample size before performing this test? Either frequentist or Bayesian methods work for me, thanks! The context is that the amount of data is not huge, I have around 5000 users …

Topic: bayesian ab-test

Category: Data Science

Intragroup independence in two-group analysis

Rodrigo Serna Pérez

September 2, 2021 17:35

I am working on an experiment in which I want to analyze the impact of a treatment on two different groups of customers. Most of the analysis methods I have checked (for example, the t-test) assume both intragroup and crossgroup independence. I can assume crossgroup independence because the two groups are randomly split, but I have some doubts about the meaning of intragroup independence. We can assume that there is no causal effect of …

Topic: chi-square-test ab-test

Category: Data Science

How do I conduct an experiment on the new pricing if it's impossible to conduct an A/B test?

Никита Юсупов

August 24, 2021 22:04

We want to introduce a new price list for the customers of our international SaaS company. Beforehand we want to test this new price list in several countries. A/B test cannot be conducted here because it is forbidden to have different prices for different customers of the same country. Thus we want to introduce the new pricing policy in several countries and then figure out whether the new one is better than the old one. My questions are: How to …

Topic: hypothesis-testing data-analysis ab-test experiments statistics

Category: Data Science

Practical constraints in A/B testing

user9343456

June 19, 2021 12:28

I saw an article about an A/B test that Google performed way back. They wanted to decide what shade of blue a button should be and how that affects click-through rate. They divided users randomly into 100 buckets, each corresponding to a shade of blue they wanted to check (so the color is a factor with 100 levels). Now this is all well and good if all the buckets (or "treatment groups") sufficiently represent the target population. In …

Topic: ab-test experiments

Category: Data Science

How to create A/B test segments for highly variable data

MogambO

March 29, 2021 23:01

My data has a high degree of variability. My objective is to run an A/B test to check how behavior changes after new changes. Historically, all samples have shown both high and low performance, which means that any two randomly chosen cohorts show large differences when compared historically. The following is an example of a weekly comparison; the same behavior holds for monthly and daily comparisons too. W1: -10.04% W2: 3.9% W3: -4.2% W4: -3.7% W5: 5.4% …

Topic: ab-test

Category: Data Science

What is the right approach to bucket users for algorithms with different coverage for A/B testing

0xF

March 26, 2021 15:47

I have a couple of recommendation algorithms that I want to A/B test. Algorithm A has 90% user coverage and algorithm B has 95% user coverage. That means if the algorithms are asked to provide recommendations for 1000 users, algorithm A can provide them for 900 of the users and algorithm B for 950 of them. Say, for example, that out of these 1000 users 87% have recommendations from both algorithms, 3% have recommendations from only algorithm A and 8% …

Topic: ab-test experiments recommender-system

Category: Data Science

Recommender system A/B test metric events

Ruslan

March 26, 2021 09:02

I am building a personalized recommendation system for choosing games. The website's main page has a dedicated spot showing a collection of personalized game recommendations. After an A/B test between two recommender systems, I don't understand which events I should collect: only the events that follow a click on a recommendation icon, or all events (recommendation events plus events where the user picked a game elsewhere, such as through the finder)? For example, one of the metrics is total payment per user per game. Should I collect …

Topic: metric ab-test recommender-system

Category: Data Science

A/B testing: How to calculate p-value on post test segments?

jxn

December 9, 2020 07:25

My question on A/B testing is about post-test segmentation analysis. For example: I run an A/B test on my website to track bounce rate. In the treatment group, I put a video explaining my company. In the control group, I put just plain text. I pick a segment of first-time users from the USA to be split 50/50 into the 2 groups. The metric I am tracking is average bounce rate (assume 20%). Power …

Topic: hypothesis-testing ab-test experiments statistics

Category: Data Science

Time duration for ML models A/B testing

itsMe

September 9, 2020 23:11

I am going to run A/B tests for ML models, but I am not sure how long I should run them online in order to see a significant difference. What would be the right time frame, and what is the reasoning behind it? The A/B test will run against the non-ML systems. We usually run tests of non-ML features for 2 weeks at most. Thank you

Topic: ab-test machine-learning

Category: Data Science

Treatment and Control selection in A/B Testing

Egodym

August 17, 2020 20:11

I'm hoping to get a better understanding of A/B Testing design. In particular, I'm interested in understanding how treatment and control units are selected. I read that these 2 groups are selected randomly (for example, here), but then there are also approaches where after picking the treatment (either randomly or not) the control is selected based on "similarity" to the treatment group. Are both approaches valid and what's the rationale for picking one or the other? For example, Alteryx has …

Topic: causalimpact ab-test statistics

Category: Data Science
