Evaluation metrics for multiple values per session

I have an application that calls my foo() function several times for each user session. There are two alternative algorithms I can implement as foo(), and my goal is to evaluate them based on execution delay.

The number of times foo() is called per user session varies, but will not exceed 10,000. Say the delay values are:

Algo1: [ [12, 30, 20, 40, 24, 280] , [13, 14, 15, 100], [20, 40] ]
Algo2: [ [1, 10, 5, 4, 150, 20] , [14, 10, 20], [21, 33, 41, 79] ]

My question is: what's the best metric to pick the winner?

Possible options:

  1. Take the average of each session, then evaluate the CDF
  2. Take the median of each session, then evaluate the CDF
  3. Anything else?
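To make options 1 and 2 concrete, here is a small sketch (using the sample data above) that collapses each session to a single summary statistic; the resulting per-session values are what you would then feed into a CDF comparison:

```python
import statistics

algo1 = [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
algo2 = [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]

def session_summaries(sessions, stat):
    """Collapse each session's delay list to one number (e.g. mean or median)."""
    return [stat(s) for s in sessions]

# Option 1: per-session means
print(session_summaries(algo1, statistics.mean))    # → [67.66666666666667, 35.5, 30]

# Option 2: per-session medians (less sensitive to the 280 outlier)
print(session_summaries(algo1, statistics.median))  # → [27.0, 14.5, 30.0]
```

Note how the 280 ms outlier in the first session pulls the mean up to ~68 while the median stays at 27, which is exactly the trade-off between the two options.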




It is common to look at 90th or 99th percentile latency in computer systems.

A user won't notice the difference between a couple of milliseconds of lag, but if a function occasionally takes several seconds, that is very noticeable.
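As a sketch of this approach, one option (an assumption, not the only choice) is to pool every call's delay across sessions and read off a nearest-rank percentile; you could instead take the percentile per session first:

```python
import math

algo1 = [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
algo2 = [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]

def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Flatten all sessions into one list of delays per algorithm
pooled1 = [d for session in algo1 for d in session]
pooled2 = [d for session in algo2 for d in session]

print(percentile(pooled1, 90))  # → 100
print(percentile(pooled2, 90))  # → 79
```

On this toy data, Algo2's tail latency is lower even though both have occasional spikes; with real data (up to 10,000 calls per session) the 90th/99th percentile is far more stable than on these tiny samples.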


Here is a suggestion:

Standardise everything (if you omit this step, a single large value like 9999 can dominate everything), then take the average value per user session. Then, optionally, multiply this number by x/10, for example, where x is the sample size in the session (think of it as evidence: more samples add more confidence). Finally, average over the number of sessions for the algorithm.
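One possible reading of this recipe as code (the joint z-score standardisation and the x/10 weight are illustrative choices, not prescriptive): standardise all delays from both algorithms together so the scores are comparable, average per session, weight by session size, then average across sessions.

```python
import statistics

algo1 = [[12, 30, 20, 40, 24, 280], [13, 14, 15, 100], [20, 40]]
algo2 = [[1, 10, 5, 4, 150, 20], [14, 10, 20], [21, 33, 41, 79]]

# Standardise against the pooled delays of BOTH algorithms,
# so the two scores live on the same scale.
pooled = [d for algo in (algo1, algo2) for s in algo for d in s]
mu = statistics.mean(pooled)
sigma = statistics.stdev(pooled)

def score(sessions):
    session_scores = []
    for s in sessions:
        z = [(d - mu) / sigma for d in s]                # standardise each delay
        weight = len(s) / 10                             # x/10: more calls, more evidence
        session_scores.append(statistics.mean(z) * weight)
    return statistics.mean(session_scores)               # average over sessions

print(score(algo1))  # lower (more negative) means faster than the pooled average
print(score(algo2))
```

On this sample data Algo2 gets the lower (better) score. One caveat: the x/10 weight shrinks short sessions toward zero rather than toward the overall mean, so with very unequal session sizes you may prefer a proper weighted mean (weights summing to 1) instead.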
