Test-and-control analysis to measure the impact of a sales rep change on territories

I hope you all are doing well.

Before I get to the problem statement, a few terms for reference:

  • Territory = Sales Territory: think of it as a county/region assigned to a particular rep, with no overlap of area or customers between two reps
  • Rep / Sales Rep = Sales Representative who visits customers to convert sales
  • Calls = Number of times a customer is visited by the rep in a month
  • Goal Attainment = % of target achieved for the month, e.g. if the target was 500 units and total sales were 600, attainment is 120%
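
To make the Goal Attainment definition concrete, here is a minimal sketch of the calculation in Python. The function name `goal_attainment` is only illustrative and not part of any existing pipeline.

```python
def goal_attainment(total_sales, target):
    """Return goal attainment as a percentage of the monthly target."""
    if target <= 0:
        raise ValueError("target must be positive")
    return 100.0 * total_sales / target


# The example from the definition above: target 500 units, sales 600 units.
print(goal_attainment(600, 500))  # 120.0
```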

I am working on a statistical/data-science problem and would like to get some thoughts on how to approach the hypothesis testing.

We have some attributes at the Territory level: Total Sales, New-to-Brand Sales, Total Calls (by reps) and % Goal Attainment, all rolled up at the Territory-Month level. I have about two years of data, Jan 2018 to Jan 2020.

The problem I want to solve is a test-and-control analysis to see whether a sales rep change has any impact on territory performance (sales). The test group would be a set of 30 territories that have undergone a rep change in the last 2 years (the rep changed at least 6 months ago, i.e. no later than July 2019), and the control group would consist of similar territories with no change in sales rep over the same 2 years.

I would like some thoughts on how to find a matching control territory for each test territory. I have a list of 107 territories, 30 of which had a rep change (the test group), leaving 77 available to form the control group. Since Sales is my target variable, I'm thinking of building a composite score from normalized Calls and Goal Attainment, computing each territory's distance from the mean (or the mean-squared value), and pairing each test territory with the control territory whose distance from the mean is closest to its own.
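
To make the pairing idea concrete, here is a rough sketch of one way it could look in Python. It is a variant of the approach described above rather than an exact implementation: instead of collapsing each territory to a single distance-from-mean score, it z-scores Calls and Goal Attainment and greedily pairs each test territory with the nearest unused control in that two-dimensional space. The function names, the input layout (one average per metric per territory) and the territory IDs are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def standardize(values):
    """Z-score a list of numbers using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def match_controls(metrics, test_ids):
    """Greedily pair each test territory with its nearest unused control.

    metrics:  dict of territory id -> (avg monthly calls, avg goal attainment %)
    test_ids: territories with a rep change; all others are candidate controls
    """
    ids = list(metrics)
    z_calls = standardize([metrics[t][0] for t in ids])
    z_attain = standardize([metrics[t][1] for t in ids])
    z = {t: (c, a) for t, c, a in zip(ids, z_calls, z_attain)}

    test_set = set(test_ids)
    available = [t for t in ids if t not in test_set]
    pairs = {}
    for test in test_ids:
        if not available:
            break  # ran out of controls
        # Euclidean distance between the standardized metric pairs
        best = min(available, key=lambda c: math.dist(z[test], z[c]))
        pairs[test] = best
        available.remove(best)  # match without replacement
    return pairs


# Made-up numbers purely for illustration (not real territory data).
example = {
    "T1": (40, 95), "T2": (60, 110),
    "C1": (42, 97), "C2": (58, 108), "C3": (20, 70),
}
pairs = match_controls(example, ["T1", "T2"])
```

Greedy matching depends on the order in which test territories are processed, so the result is not guaranteed to minimize the total distance across all pairs. It would also seem sensible to compute the averages only from months before the rep change, so that the matching variables are not themselves affected by the change.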

Once the test and control groups are formed, I want to conduct hypothesis testing, with the null hypothesis being that a rep change doesn't impact sales. For this, I'm planning to use a two-tailed t-test (n = 30) at the 5% significance level (95% confidence). I would really appreciate your thoughts on this approach and on anything else I could do to make the testing more robust.
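
For reference, here is a minimal sketch of how the test could be computed using only the Python standard library. It assumes the paired form of the t-test, since each test territory is matched to one control, and it assumes each input value is a single sales figure per territory (for example, the change in average monthly sales after versus before the rep change). Both are assumptions on my part; the function names are illustrative. The critical value 2.045 is the two-tailed 5% cutoff for 29 degrees of freedom, i.e. 30 pairs.

```python
import math
from statistics import mean, stdev

# Two-tailed critical value of Student's t at alpha = 0.05 with 29 df (30 pairs).
T_CRIT_DF29 = 2.045


def paired_t_statistic(test_values, control_values):
    """Return (t statistic, degrees of freedom) for matched pairs."""
    if len(test_values) != len(control_values):
        raise ValueError("each test territory needs exactly one matched control")
    diffs = [t - c for t, c in zip(test_values, control_values)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    sd = stdev(diffs)  # sample standard deviation of the pairwise differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


def rejects_null(t_stat, t_crit=T_CRIT_DF29):
    """Two-tailed decision: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_crit
```

With 30 matched pairs, `paired_t_statistic` returns 29 degrees of freedom and `rejects_null` can be used with its default cutoff. For any other number of pairs, the critical value would need to be looked up for the corresponding degrees of freedom.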
