Payment data prediction at test time
I have a client's payment data. I want to predict the probability of customers paying late, with the target classes being 0-30 days, 30-60 days, 60-90 days, and 90+ days, based on this paper.
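To make the target definition concrete, here is a minimal sketch of how an invoice could be mapped to one of the four classes. It rests on assumptions that are not confirmed by my data: lateness is taken as `Diff in days` minus `Payment Terms`, each bucket includes its lower bound, and invoices paid early or on time fall into the first bucket. The function name `late_bucket` is made up for illustration.

```python
def late_bucket(diff_in_days: float, payment_terms: float) -> str:
    """Map an invoice to one of the four target classes.

    Assumptions (not confirmed by the data): lateness is
    diff_in_days - payment_terms, and each bucket includes its lower
    bound, so exactly 30 days late falls in "30-60 days".
    """
    # Early or on-time payments are clamped to 0 days late.
    days_late = max(diff_in_days - payment_terms, 0.0)
    if days_late < 30:
        return "0-30 days"
    if days_late < 60:
        return "30-60 days"
    if days_late < 90:
        return "60-90 days"
    return "90+ days"


# Values from the sample training row: Diff in days = 177, Payment Terms = 45.
print(late_bucket(177.0, 45.0))
```

Note that this label needs `Diff in days`, which only exists once the invoice has been cleared, so it can be computed for training rows but not for test rows.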
The features I have are as follows:
Amount | Payment Terms | Diff in days | Paid Invoices bef order | Paid invoices late | Ratio of paid inv which were late | Sum of inv bef order | Sum inv late | Ratio of outs inv | Avg days late outs | Target |
---|---|---|---|---|---|---|---|---|---|---|
14298.0 | 45.0 | 177.0 | 0 | 0 | 0.001 | 1.429800e+04 | 14298.0 | 1.000000 | 177.0 | 90+ days |
At test time, I have values like the following:
Amount | Payment Terms | Diff in days | Paid Invoices bef order | Paid invoices late | Ratio of paid inv which were late | Sum of inv bef order | Sum inv late | Ratio of outs inv | Avg days late outs | Target |
---|---|---|---|---|---|---|---|---|---|---|
24043.0 | 45.0 | NaN | 0 | 0 | 0.001 | 1.471976e+08 | 3067472.0 | 0.020839 | NaN | 90+ days |
Data description:
- Amount: Amount to be paid
- Payment Terms: Number of days within which the payment must be made
- Diff in days: Difference between the clearing date and the invoice date
- Paid Invoices bef order: Number of paid invoices before the new order (the next data point) was created
- Paid invoices late: Number of paid invoices that were late
- Ratio of paid inv which were late: Paid invoices late / Paid Invoices bef order
- Sum of inv bef order: Sum of invoices before the new order was created
- Sum inv late: Sum of invoices that were late before the new order was created
- Ratio of outs inv: Sum inv late / Sum of inv bef order
- Avg days late outs: Average days late of all outstanding invoices that were late prior to a new invoice for a customer
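For reference, the history-based features above could be computed roughly as in the sketch below. This is one possible reading of the definitions, not the exact pipeline I use: the `Invoice` class, the `history_features` function, and the output key names are all hypothetical, the ratio of outstanding invoices is taken as late sum over total sum (which matches the sample rows), and "late" is assumed to mean cleared or still open after invoice date plus payment terms. Every feature is computed as of the new invoice's date, using only what was known about earlier invoices at that point.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Invoice:  # hypothetical record layout, for illustration only
    amount: float
    invoice_date: date
    payment_terms: int                    # days allowed for payment
    clearing_date: Optional[date] = None  # None while still unpaid

    @property
    def due_date(self) -> date:
        return self.invoice_date + timedelta(days=self.payment_terms)


def history_features(history: list[Invoice], as_of: date) -> dict[str, float]:
    """Features for a new invoice created on `as_of`, from earlier invoices only."""
    prior = [inv for inv in history if inv.invoice_date < as_of]
    # Paid: cleared before the new invoice was created.
    paid = [inv for inv in prior
            if inv.clearing_date is not None and inv.clearing_date < as_of]
    paid_late = [inv for inv in paid if inv.clearing_date > inv.due_date]
    # Outstanding: not yet cleared on `as_of` (a later clearing date is unknown then).
    outstanding = [inv for inv in prior
                   if inv.clearing_date is None or inv.clearing_date >= as_of]
    overdue = [inv for inv in outstanding if inv.due_date < as_of]

    sum_before = sum(inv.amount for inv in prior)
    sum_late = sum(inv.amount for inv in paid_late + overdue)
    days_late = [(as_of - inv.due_date).days for inv in overdue]

    return {
        "paid_inv_before": float(len(paid)),
        "paid_inv_late": float(len(paid_late)),
        # Assumption: ratios default to 0.0 when the denominator is empty.
        "ratio_paid_late": len(paid_late) / len(paid) if paid else 0.0,
        "sum_inv_before": sum_before,
        "sum_inv_late": sum_late,
        "ratio_outs_inv": sum_late / sum_before if sum_before else 0.0,
        "avg_days_late_outs": sum(days_late) / len(days_late) if days_late else 0.0,
    }


# Made-up history for one customer, followed by a new invoice on 2023-04-10.
history = [
    Invoice(1000.0, date(2023, 1, 1), 30, date(2023, 1, 20)),  # paid on time
    Invoice(2000.0, date(2023, 2, 1), 30, date(2023, 3, 20)),  # paid late
    Invoice(3000.0, date(2023, 3, 1), 30),                     # still unpaid
]
feats = history_features(history, as_of=date(2023, 4, 10))
print(feats)
```

The amounts and dates in the example are invented and do not come from my dataset.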
The invoice information at test time doesn't include a clearing date, so the 'Diff in days' and 'Avg days late outs' columns contain NaNs.
How can I build features for my test data if I don't know the clearing date? And how can I predict a future data point when I don't have the information I would need to engineer those features?