Payment data prediction at test time

I have a client's payment data. I want to predict the probability that a customer pays late, with target classes 0-30 days, 30-60 days, 60-90 days, and 90+ days, based on this paper.

The features I have are as follows:

| Amount | Payment Terms | Diff in days | Paid Invoices bef order | Paid invoices late | Ratio of paid inv which were late | Sum of inv bef order | Sum inv late | Ratio of outs inv | Avg days late outs | Target |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 14298.0 | 45.0 | 177.0 | 0 | 0 | 0.001 | 1.429800e+04 | 14298.0 | 1.000000 | 177.0 | 90+ days |

And at test time I have values as follows:

| Amount | Payment Terms | Diff in days | Paid Invoices bef order | Paid invoices late | Ratio of paid inv which were late | Sum of inv bef order | Sum inv late | Ratio of outs inv | Avg days late outs | Target |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 24043.0 | 45.0 | NaN | 0 | 0 | 0.001 | 1.471976e+08 | 3067472.0 | 0.020839 | NaN | 90+ days |

Data description:

  1. Amount: Amount to be paid
  2. Payment terms: The period within which the payment must be made
  3. Diff in days: Difference between the clearing date and the invoice date
  4. Paid invoices bef order: Number of paid invoices before the new order was created in the next data point
  5. Paid invoices late: Number of paid invoices that were late
  6. Ratio of paid inv which were late: Paid invoices late / Paid invoices bef order
  7. Sum of inv bef order: Sum of invoices before the new order was created
  8. Sum inv late: Sum of invoices that were late before the new order was created
  9. Ratio of outs inv: Sum inv late / Sum of inv bef order (item 8 divided by item 7)
  10. Avg days late outs: Average days late of all outstanding invoices that were late prior to a new invoice for a customer
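To make the two ratio definitions (items 6 and 9) concrete, here is a minimal pandas sketch using the two sample rows from above. The column names and the `safe_ratio` helper are made up for illustration. The fill value of 0.001 is an assumption: the sample rows show 0.001 where both counts are 0, which suggests a placeholder is used when the denominator is zero, but I have not confirmed that.

```python
import pandas as pd

# Two rows copied from the question (training row, then test row).
# Column names are hypothetical shorthand for the features listed above.
df = pd.DataFrame({
    "paid_inv_bef_order": [0, 0],
    "paid_inv_late": [0, 0],
    "sum_inv_bef_order": [14298.0, 1.471976e08],
    "sum_inv_late": [14298.0, 3067472.0],
})

def safe_ratio(num, den, fill):
    """Element-wise num / den, using `fill` wherever den is zero."""
    den = den.astype(float)
    # Zero denominators become NaN, so the division yields NaN there,
    # which is then replaced by the fill value.
    return (num / den.where(den != 0)).fillna(fill)

# Item 6: Paid invoices late / Paid invoices bef order.
# fill=0.001 is assumed from the sample rows, not a documented rule.
df["ratio_paid_inv_late"] = safe_ratio(
    df["paid_inv_late"], df["paid_inv_bef_order"], fill=0.001
)

# Item 9: Sum inv late / Sum of inv bef order (item 8 divided by item 7).
df["ratio_outs_inv"] = safe_ratio(
    df["sum_inv_late"], df["sum_inv_bef_order"], fill=0.0
)

print(df[["ratio_paid_inv_late", "ratio_outs_inv"]])
```

Neither ratio needs the clearing date of the new invoice, so both can be computed the same way for the test rows.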

The invoice information at test time doesn't have a clearing date, hence the 'Diff in days' and 'Avg days late outs' columns contain NaNs.
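To show where the NaNs come from, here is a small sketch of how 'Diff in days' and the target class would be derived. The dates and column names are invented for illustration. It also assumes that days late means 'Diff in days' minus 'Payment terms' and that the class boundaries are right-closed; neither is stated above, so treat both as guesses.

```python
import numpy as np
import pandas as pd

# Hypothetical invoices: the first has been cleared, the second (a test-time
# invoice) has no clearing date yet. Dates are made up for illustration.
invoice_dates = pd.to_datetime(["2020-01-01", "2020-09-01"])
invoices = pd.DataFrame({
    "invoice_date": invoice_dates,
    "clearing_date": pd.to_datetime(
        [invoice_dates[0] + pd.Timedelta(days=177), pd.NaT]
    ),
    "payment_terms": [45.0, 45.0],
})

# Diff in days: clearing date minus invoice date. A missing clearing date
# (NaT) propagates to NaN here.
invoices["diff_in_days"] = (
    invoices["clearing_date"] - invoices["invoice_date"]
).dt.days

# Assumption: days late = Diff in days - Payment terms.
invoices["days_late"] = invoices["diff_in_days"] - invoices["payment_terms"]

# Assumption: right-closed bins. Invoices paid before the due date
# (negative days late) fall outside the bins and get NaN.
invoices["target"] = pd.cut(
    invoices["days_late"],
    bins=[0, 30, 60, 90, np.inf],
    labels=["0-30 days", "30-60 days", "60-90 days", "90+ days"],
    include_lowest=True,
)

print(invoices[["diff_in_days", "days_late", "target"]])
```

Under these assumptions 'Diff in days' is tied directly to the target itself: it is only known once the invoice is cleared, which is exactly the information that is missing at test time.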

How can I build features for my test data if I don't know the clearing date? And how can I predict a future data point when I lack the information needed to engineer these features?

Topic methodology feature-engineering feature-extraction machine-learning

Category Data Science
