Using Transaction Amount to Guide Learning in a Fraud Detection Machine Learning Model

I am currently using transaction amount as a feature in an XGBoost classification model designed to identify fraudulent transactions. For this problem, transaction amount is bounded between 0 and 500. Using it as a feature does improve target class separability, but I can't help wondering whether there is a better way to use this variable. Specifically, I care more about getting the high-amount transactions right than the low-amount ones, yet the model currently has no way of knowing that. I have taught the XGBoost algorithm that the positive class is in effect more important by adjusting scale_pos_weight, but I haven't found a way to teach it that high transaction amounts are more important.

EDIT: I wanted to provide a bit more detail. After some additional reading, I think what I may be looking for is some kind of custom objective function, possibly along the lines of what is discussed here.
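To sketch what that could look like: a custom objective can scale the usual log-loss gradient and hessian by each transaction's amount. This is only a minimal illustration; the weighting formula `1 + amount / 500` and names like `train_amounts` are assumptions of mine, not taken from any reference.

```python
import numpy as np
import xgboost as xgb

def make_amount_weighted_logloss(amounts):
    # Per-row weight in [1, 2]; low-amount transactions keep weight 1.
    # The formula is an arbitrary starting point, not a prescription.
    w = 1.0 + amounts / 500.0

    def objective(preds, dtrain):
        y = dtrain.get_label()
        p = 1.0 / (1.0 + np.exp(-preds))  # raw margins -> probabilities
        grad = w * (p - y)                # weighted log-loss gradient
        hess = w * p * (1.0 - p)         # weighted log-loss hessian
        return grad, hess

    return objective

# booster = xgb.train(params, dtrain, num_boost_round=200,
#                     obj=make_amount_weighted_logloss(train_amounts))
```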

Topic: loss-function, xgboost, hyperparameter

Category: Data Science


I see some approaches:

The first one you already mentioned: choosing a metric that better fits your problem (overall cost? overall cost relative to the transactions involved?). In your case that might translate into weights in the eval metric. That is probably the best way to go.

Depending on the data you have, the algorithm you chose, and the metric you retained, you might also want to use instance weights in the training phase. That is usually empirical (there is no rule for the formulation of the weights); a sketch of both these weights and a weighted eval metric follows this list.

Another thing, more important for explainability than for raw performance, is how you take amount into account in feature engineering. Using amount as-is might not be the best feature. You might want to consider ratios like amount/total_assets, amount/liquid_assets, or amount/avg_account_amount, or standardise other features by the amount, for example fees/amount. This usually helps your model learn common patterns instead of over-fitting on specific amounts.
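To make the first two points concrete, here is a minimal sketch of amount-based instance weights together with an amount-weighted eval metric in XGBoost. The weighting formula and the input names (`X_train`, `amount_train`, `amount_val`, and so on) are assumptions made for the example:

```python
import numpy as np
import xgboost as xgb

# Assumed inputs: X_train, y_train, X_val, y_val and the matching
# transaction amounts amount_train, amount_val (all numpy arrays).

# 1. Instance weights: make high-amount rows count more during training.
#    `1 + amount/500` is an arbitrary starting point; tune it empirically.
w_train = 1.0 + amount_train / 500.0
dtrain = xgb.DMatrix(X_train, label=y_train, weight=w_train)
dval = xgb.DMatrix(X_val, label=y_val)

# 2. Amount-weighted eval metric: the share of money (rather than the
#    share of transactions) that the model misclassifies on validation.
def make_amount_error(amounts):
    def amount_error(preds, dmat):
        y = dmat.get_label()
        wrong = (preds > 0.5).astype(float) != y
        return "amount_error", float((amounts * wrong).sum() / amounts.sum())
    return amount_error

booster = xgb.train(
    {"objective": "binary:logistic", "eta": 0.1},
    dtrain,
    num_boost_round=200,
    evals=[(dval, "val")],
    custom_metric=make_amount_error(amount_val),  # xgboost >= 1.6
)
```

A metric denominated in money at risk is often closer to what a fraud team actually cares about than a plain error rate.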


After some searching I have found example-dependent cost-sensitive classification, as outlined in this thesis. There is also an associated Python package (CostCla). Admittedly, I just found this and am no expert, but these techniques appear to provide a method for training machine learning models with example-dependent costs.
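A rough sketch of how CostCla could be applied here follows. The cost figures (a missed fraud costs the full transaction amount, a false alarm a flat review cost) and the input names are illustrative assumptions, not anything the package prescribes:

```python
import numpy as np
from costcla.models import CostSensitiveRandomForestClassifier
from costcla.metrics import savings_score

# Assumed inputs: X_train, y_train, X_test, y_test plus the matching
# transaction amounts amount_train and amount_test.

def cost_matrix(amounts, review_cost=5.0):
    """Per-example cost matrix in costcla's column order:
    [false positive, false negative, true positive, true negative]."""
    c = np.zeros((len(amounts), 4))
    c[:, 0] = review_cost  # false alarm: cost of an analyst review
    c[:, 1] = amounts      # missed fraud: lose the transaction amount
    return c

clf = CostSensitiveRandomForestClassifier()
clf.fit(X_train, y_train, cost_matrix(amount_train))
y_pred = clf.predict(X_test)

# Fraction of the naive policy's cost that the model saves.
print(savings_score(y_test, y_pred, cost_matrix(amount_test)))
```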


Since you are using boosting models, you can add what are called monotonic constraints:

It is often the case in a modeling problem or project that the functional form of an acceptable model is constrained in some way. This may happen due to business considerations, or because of the type of scientific question being investigated. In some cases, where there is a very strong prior belief that the true relationship has some quality, constraints can be used to improve the predictive performance of the model.

I also recommend reading this post.
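For example, with XGBoost's scikit-learn wrapper (a sketch; it assumes amount is the first of three feature columns, and forcing the fraud score to rise with amount is itself a modelling assumption to validate):

```python
import xgboost as xgb

# Tuple entries per feature column:
#  1 = prediction must be non-decreasing in that feature,
# -1 = non-increasing, 0 = unconstrained.
model = xgb.XGBClassifier(
    n_estimators=300,
    monotone_constraints=(1, 0, 0),
)
# model.fit(X_train, y_train)
```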
