Evaluation metrics for multiple values per session
I have an application that executes my foo() function several times for each user session. There are two alternative algorithms that I can implement as the foo() function, and my goal is to evaluate them based on execution delay.
The number of times foo() is called per user session is variable but will not exceed 10000. Say the delay values (one inner list per session) are:
Algo1: [ [12, 30, 20, 40, 24, 280] , [13, 14, 15, 100], [20, 40] ]
Algo2: [ [1, 10, 5, 4, 150, 20] , [14, 10, 20], [21, 33, 41, 79] ]
My question is: what's the best metric to pick the winner?
Possible options:
- take the average of each session, then evaluate the CDF of those averages
- take the median of each session, then evaluate the CDF of those medians
- anything else?
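To make the first two options concrete, here is a minimal sketch of what I have in mind, using only the sample delays above. The helper names per_session and empirical_cdf are made up for illustration, and the "CDF" here is just the cumulative fraction of sessions at each sorted value; this is not meant as a definitive way to do the comparison.

```python
from statistics import mean, median

# Delay values copied from the question (one inner list per user session).
algo1 = [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
algo2 = [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]


def per_session(sessions, stat):
    """Reduce each session to a single number using `stat` (mean or median)."""
    return [stat(session) for session in sessions]


def empirical_cdf(values):
    """Pair each sorted value with the cumulative fraction of values up to it."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


for name, sessions in [("Algo1", algo1), ("Algo2", algo2)]:
    session_means = per_session(sessions, mean)
    session_medians = per_session(sessions, median)
    print(name, "CDF of session means:  ", empirical_cdf(session_means))
    print(name, "CDF of session medians:", empirical_cdf(session_medians))
```

With only three sessions per algorithm the CDFs are very coarse, so this only illustrates the mechanics. Note also that the per-session mean is pulled up by single large delays such as 280 or 150, while the per-session median is not.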
Topic distribution descriptive-statistics evaluation accuracy statistics
Category Data Science