Scaling a credit risk scorecard

I need to build a credit risk scorecard using logistic and linear regression. The predictor variables are all dummies, where each dummy is a bin of some underlying variable. Take the variable age, for example: I have 4 dummy bins for "age": age_20 (1 when the client is younger than 20), age_2030 (1 when the client is between 20 and 30), age_3040 (same for 30 to 40), and age_4050 (same for 40 to 50). I have more variables like this.
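To make the setup concrete, here is a minimal sketch of the age dummies in Python. The bin edges are an assumption on my part: each bin is treated as half-open, so a client aged exactly 20 falls in age_2030. The names `AGE_BINS` and `age_dummies` are only illustrative.

```python
# Assumed bin edges: each bin is half-open, [low, high).
AGE_BINS = {
    "age_20": (0, 20),
    "age_2030": (20, 30),
    "age_3040": (30, 40),
    "age_4050": (40, 50),
}


def age_dummies(age):
    """Return one 0/1 indicator per age bin for a single client."""
    return {name: int(low <= age < high) for name, (low, high) in AGE_BINS.items()}


print(age_dummies(25))
```

With these four bins, a client aged 50 or more gets all zeros, since no bin covers that range.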

To run the linear or logistic regression I need to drop one dummy, because otherwise the variables will be collinear. I also apply backward selection, so only significant coefficients are kept. The problem comes afterwards: once I have the parameter estimates, I need to scale them so that each category gets a "score". The score formula is as follows:

-(woe*Beta+alpha/n)*factor+offset/n

where WOE is the weight of evidence, a statistic computed for each bin, and n is the number of variables. My question is: what should I do with the base case (the dummy I dropped) when computing the scores? And suppose my regression removed 2 more dummies because they were not significant: what does that mean for the score calculation?
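For reference, this is how I would write the formula in Python. All numbers are made up. The second part shows one possible reading of the dummy case, not a confirmed answer: the dummy value 1 takes the place of the WOE, and any bin without an estimated coefficient (the base case, or a dummy removed by selection) is treated as having Beta = 0, because it then shares the baseline with the base case.

```python
def bin_score(woe, beta, alpha, n, factor, offset):
    """Score of one bin: -(woe*beta + alpha/n)*factor + offset/n."""
    return -(woe * beta + alpha / n) * factor + offset / n


# Hypothetical values, for illustration only.
alpha = -2.0    # intercept
n = 2           # number of variables in the scorecard
factor = 20.0
offset = 500.0

# Hypothetical coefficients that survived selection. age_20 is the base
# case and age_4050 stands for a dummy dropped as non-significant.
betas = {"age_2030": 0.4, "age_3040": -0.3}
bins = ["age_20", "age_2030", "age_3040", "age_4050"]

# Assumption: the dummy value 1 replaces the WOE, and a missing
# coefficient counts as 0.
scores = {
    b: bin_score(1.0, betas.get(b, 0.0), alpha, n, factor, offset)
    for b in bins
}
print(scores)
```

Under this reading, the base case and the removed dummies all receive the same score, namely the share of the intercept and offset assigned to the variable.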

