Target encoding with multiple columns
I'm attempting to do target encoding with multiple columns on a dataframe and I'm getting an error message I don't understand.
Here is a fragment of the code.
X['District Code Encoded'] = encoder.fit_transform(X['District Code'], y)
X['Property id Encoded'] = encoder.fit_transform(X['Property id'], y)
X['Property name Encoded'] = encoder.fit_transform(X['Property name'], y)
The first line always runs, and then the second line raises a KeyError. The key reported is the column name from the second pair of square brackets on the first line, so in this case the message is KeyError: 'District Code'.
I can show more code or more details of the error message if need be.
Is it possible to work out from that what might be going wrong here?
Added later: here is a fragment of code I wrote to try to find the bug.
encoded_df['District Code Encoded'] = encoder.fit_transform(X['District Code'], y)
for col in X.columns:
    print(col)
print(X)
dataset = [['tom', 10, 7], ['patrick', 15, 8], ['john', 25, 11]]
Y = pd.DataFrame(dataset, columns=['a', 'b', 'c'])
encoded_df2 = encoder.fit_transform(Y['b'], Y['c'])
encoded_df2 = encoder.fit_transform(Y['a'], Y['c'])
print('That is done')
It produces the same error message again, ending with KeyError: 'District Code'. So it seems that when this function is run a second time, it raises a KeyError for the column that was used the first time it was run. Would I need to show the original dataframe X for anyone to understand why that error is generated?
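As a cross-check that does not depend on the encoder object at all, here is a minimal sketch of plain mean target encoding applied to each column by a separate, stateless call. It is only an illustration under stated assumptions: the helper name target_encode is hypothetical, it uses the raw per-category mean with no smoothing (so its numbers will generally differ from what a library encoder returns), and it works on plain lists rather than on the dataframe X.

```python
from collections import defaultdict


def target_encode(categories, targets):
    """Replace each category with the mean target value observed for it.

    Hypothetical helper for illustration: plain mean encoding, no smoothing.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, t in zip(categories, targets):
        sums[cat] += t
        counts[cat] += 1
    means = {cat: sums[cat] / counts[cat] for cat in sums}
    return [means[cat] for cat in categories]


# Toy rows mirroring the small test dataset in the question: columns a, b, c.
dataset = [['tom', 10, 7], ['patrick', 15, 8], ['john', 25, 11]]
columns = {
    'a': [row[0] for row in dataset],
    'b': [row[1] for row in dataset],
}
target = [row[2] for row in dataset]

# Each column is encoded by an independent call, so nothing remembered
# from one column can affect the next one.
encoded = {
    name + ' Encoded': target_encode(values, target)
    for name, values in columns.items()
}
print(encoded)
```

Because the function keeps no state between calls, encoding a second column cannot refer back to the name of the first. If this pattern works where the repeated encoder.fit_transform calls fail, that would point towards something being remembered inside the encoder object between calls, although that is an inference and not something the code above proves.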
Topic target-encoding
Category Data Science