Sensitivity analysis in outlier explanation
I am trying to explain outliers using sensitivity analysis. My dataset has 19 input variables and 1 output variable (20 numerical columns in total). I have already built a prediction model, and I treat observations with a high prediction error as outliers/anomalies. I have run a sensitivity analysis on each input individually, but the inputs are correlated with one another, e.g. input 1 is correlated with inputs 3, 4 and 7; input 2 is correlated with inputs 5, 10 and 18; and so on.
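To make the idea of correlated groups concrete, here is a minimal sketch of one way such groups could be derived from the data. It is only an illustration: the function name `correlated_groups`, the 0.7 threshold, and the choice to link inputs through any chain of strong pairwise correlations are all assumptions, not something fixed by my setup.

```python
import numpy as np

def correlated_groups(X, threshold=0.7):
    """Group columns of X whose absolute Pearson correlation is >= threshold.

    Columns are linked transitively (connected components), so a group may
    contain two inputs that are only indirectly correlated.
    The threshold value is an assumption and should be tuned.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))
    n = corr.shape[0]
    parent = list(range(n))  # union-find forest, one node per column

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if corr[i, j] >= threshold:
                parent[find(j)] = find(i)  # merge the two components

    groups = {}
    for col in range(n):
        groups.setdefault(find(col), []).append(col)
    return sorted(groups.values())
```

Calling `correlated_groups(X)` on the 19 input columns would return a list of column-index lists, one per group; inputs that are not strongly correlated with anything end up in a group of their own.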
To explain an outlier, I first check whether any of its input values are themselves outlying. If so, I want to use a sensitivity check to find out whether the output is particularly sensitive to those inputs. Because the inputs are correlated, a sensitivity analysis on individual inputs does not make much sense; what I ultimately want is the most influential group of inputs, i.e. the group whose change brings the outlying output back to normal. I would then verify whether the same group holds for similar outliers and, if so, present it as the explanation for the outlier. My question is: how can I check the sensitivity of a group of inputs in this case? If this approach does not sound logical, please let me know which part is confusing.
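To show what I mean by group sensitivity, here is a rough sketch under my own assumptions, not an established method. For one outlier, all inputs of a group are replaced at once with "normal" reference values (for example the per-column median of the non-outlying rows), and the drop in absolute prediction error is recorded. The names `group_sensitivity`, `predict`, and `reference` are hypothetical; `predict` stands for any callable that maps a 2-D array of inputs to a 1-D array of predictions.

```python
import numpy as np

def group_sensitivity(predict, x, y, reference, groups):
    """Rank input groups by how much resetting them reduces the prediction error.

    predict   : callable, 2-D array of inputs -> 1-D array of predictions (assumed)
    x         : 1-D array, inputs of the outlier
    y         : float, observed output of the outlier
    reference : 1-D array of "normal" input values, e.g. column medians (assumed)
    groups    : list of lists of column indices that are perturbed together
    """
    x = np.asarray(x, dtype=float)
    reference = np.asarray(reference, dtype=float)
    base_error = abs(predict(x[None, :])[0] - y)

    results = []
    for group in groups:
        idx = list(group)
        x_mod = x.copy()
        x_mod[idx] = reference[idx]  # reset the whole group jointly
        new_error = abs(predict(x_mod[None, :])[0] - y)
        results.append((tuple(idx), base_error - new_error))

    # Largest error reduction first: the top group is the candidate explanation.
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Moving correlated inputs together avoids evaluating the model on unrealistic combinations in which only one input of a group has changed. Running the same ranking on similar outliers would then show whether the top group is consistent across them.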
Topic outlier python bigdata data-mining
Category Data Science