How to export shap waterfall values to dataframe?
I am working on a binary classification using random forest model, neural networks in which am using SHAP to explain the model predictions. I followed the tutorial and wrote the below code to get the waterfall plot shown below
row_to_show = 20
data_for_prediction = ord_test_t.iloc[row_to_show] # use 1 row of data here. Could use multiple rows if desired
data_for_prediction_array = data_for_prediction.values.reshape(1, -1)
rf_boruta.predict_proba(data_for_prediction_array)
explainer = shap.TreeExplainer(rf_boruta)
# Calculate Shap values
shap_values = explainer.shap_values(data_for_prediction)
shap.plots._waterfall.waterfall_legacy(explainer.expected_value[0], shap_values[0],ord_test_t.iloc[row_to_show])
This generated the plot as shown below
However, I want to export this to dataframe and how can I do it?
I expect my output to be like as shown below. I want to export this for the full dataframe. Can you help me please?
update - error message
---- 1 pd.DataFrame({
2 Feature Name: [Base value] + [fFeature {i} for i in range(ord_test_t.shape[1])],
3 Contribution: (explainer.expected_value[0]) + list(shap_values[0])
4 })
~\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
634 elif isinstance(data, dict):
635 # GH#38939 de facto copy defaults to False only in non-dict cases
-- 636 mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
637 elif isinstance(data, ma.MaskedArray):
638 import numpy.ma.mrecords as mrecords
~\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in dict_to_mgr(data, index, columns, dtype, typ, copy)
500 # TODO: can we get rid of the dt64tz special case above?
501
-- 502 return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
503
504
~\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in arrays_to_mgr(arrays, columns, index, dtype, verify_integrity, typ, consolidate)
118 # figure out the index, if necessary
119 if index is None:
-- 120 index = _extract_index(arrays)
121 else:
122 index = ensure_index(index)
~\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in _extract_index(data)
672 lengths = list(set(raw_lengths))
673 if len(lengths) 1:
-- 674 raise ValueError(All arrays must be of the same length)
675
676 if have_dicts:
ValueError: All arrays must be of the same length