Significance testing - repeated observations over multiple days

I work in mobile gaming, and want to analyze A/B test groups, but I believe I'm introducing errors in my calculations.

The metric I'm looking at is: number of unique players who engaged in battle that day / number of unique players who were active that day.

My data currently has one row per active player per day: date, player_id, group (A/B), and a 0/1 flag for whether the player engaged in battle that day. I split the rows by group, take each group's array of flags, and run a t-test. I suspect that I'm overstating the sample size.

For example, I might have 1,000 people in Group A, of whom 300 are active each day, and most of those 300 are the same 200 people who are active every day. I'm feeding 300 active players x 7 days = 2,100 rows of data into a t-test, but most of this is data from the same 200 people.
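The arithmetic in this example can be made concrete with a small sketch. The player IDs are made up, and the assumption that the 100 non-regular players are different people each day is purely illustrative; it is not taken from real data.

```python
# Hypothetical Group A activity: 7 days, 300 active players per day.
DAYS = 7
N_REGULARS = 200  # assumed: the same 200 players are active every day

rows = []
for day in range(DAYS):
    regulars = range(N_REGULARS)
    # Assumed for illustration: 100 occasional players per day who never repeat.
    occasional = range(N_REGULARS + day * 100, N_REGULARS + (day + 1) * 100)
    for player_id in list(regulars) + list(occasional):
        rows.append((day, player_id))

n_rows = len(rows)                                   # rows fed to the t-test
n_unique = len({pid for _, pid in rows})             # distinct players behind them
n_regular_rows = sum(1 for _, pid in rows if pid < N_REGULARS)

print(n_rows, n_unique, n_regular_rows)  # 2100 900 1400
```

Under these assumptions, two thirds of the 2,100 rows come from the same 200 players, so the rows are not 2,100 independent observations.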

What is the correct way to do this?

Thank you!

Topic ab-test

Category Data Science


It depends on what you are testing and what your assumptions are.

Here are a few scenarios:

  1. If you are looking at a single day, then it does not matter: each player appears at most once that day, so the rows are independent.

  2. If you are comparing multiple days but looking at each day independently, then you might want to control for the multiple comparisons problem.

  3. If you are looking for changes over days, then you might want to treat it as a time series problem.
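As a rough sketch of scenarios 1 and 2, the code below tests each day separately and then applies a Bonferroni correction across the days. The two-proportion z-test and the Bonferroni correction are one reasonable choice here, not the only one. The daily counts and the helper name `two_proportion_z_test` are hypothetical.

```python
import math

def two_proportion_z_test(x_a, n_a, x_b, n_b):
    """Two-sided z-test for equal proportions, using the pooled estimate."""
    p_a, p_b = x_a / n_a, x_b / n_b
    p_pool = (x_a + x_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value

# Hypothetical counts per day: (battlers_A, active_A, battlers_B, active_B).
# Within one day each player contributes a single row.
daily_counts = {
    "day1": (150, 300, 171, 310),
    "day2": (148, 295, 160, 305),
    "day3": (155, 302, 158, 298),
}

alpha = 0.05
# Bonferroni: divide the significance level by the number of daily tests.
bonferroni_alpha = alpha / len(daily_counts)

results = {}
for day, (x_a, n_a, x_b, n_b) in daily_counts.items():
    z, p = two_proportion_z_test(x_a, n_a, x_b, n_b)
    results[day] = (z, p, p < bonferroni_alpha)

for day, (z, p, significant) in results.items():
    print(f"{day}: z={z:.3f}, p={p:.4f}, significant={significant}")
```

Bonferroni is conservative; with many days a less strict correction might be preferable. This sketch does not cover scenario 3, where the days are analysed together as a time series.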
