Insights between two columns/variables in a DataFrame

Question


marco

April 3, 2022 20:07

I have data in two columns: one is a range for the old credit score (Input score range) and the other is the new credit score (CVSC100). How do I find insights from the two of them, given that the old score is a range of values and the other column (CVSC100) is not?

I know how to calculate the Pearson correlation between two DataFrame columns in Python, but I believe that is not sufficient. How should I proceed? Please advise.

Topic data-analysis descriptive-statistics python data-mining machine-learning

Category Data Science

lcrmorin · Accepted Answer · March 4, 2020 16:49

The first step would be to bin CVSC100 into the same bins you have for your input; that will surely help your comparisons.
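As a rough illustration of the binning step, here is a minimal sketch using `pandas.cut`. The column names (`input_score_range`, `cvsc100`), the range labels, and the bin edges are all assumptions made up for the example; you would replace them with the ranges that actually appear in your input column.

```python
import pandas as pd

# Hypothetical data: column names, range labels and values are assumptions.
df = pd.DataFrame({
    "input_score_range": ["300-499", "500-699", "700-850", "500-699", "700-850"],
    "cvsc100": [450, 520, 810, 705, 640],
})

# Bin edges chosen to match the hypothetical range labels above.
edges = [300, 500, 700, 851]
labels = ["300-499", "500-699", "700-850"]

# right=False makes each bin closed on the left: [300, 500), [500, 700), [700, 851).
df["cvsc100_bin"] = pd.cut(df["cvsc100"], bins=edges, labels=labels, right=False)

print(df)
```

Values outside the edges come back as NaN, which can be a handy check that the chosen edges really cover the whole CVSC100 range.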

The second step would be to build a frequency table with your input range in columns and the binned CVSC100 in rows, with the count of values at each intersection. This would help you see how CVSC100 is distributed relative to your input and understand the underlying process.
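One possible way to build such a table is `pandas.crosstab`. This is only a sketch: the column names and the already-binned values below are invented for illustration.

```python
import pandas as pd

# Hypothetical, already-binned data; column names and labels are assumptions.
df = pd.DataFrame({
    "input_score_range": ["300-499", "500-699", "700-850", "500-699", "700-850", "300-499"],
    "cvsc100_bin":       ["300-499", "500-699", "700-850", "700-850", "500-699", "500-699"],
})

# Rows: binned CVSC100; columns: input score range; cells: counts.
freq = pd.crosstab(df["cvsc100_bin"], df["input_score_range"])

# Share of each input range that lands in each CVSC100 bin (each column sums to 1).
col_share = pd.crosstab(df["cvsc100_bin"], df["input_score_range"], normalize="columns")

print(freq)
print(col_share)
```

Counts on the diagonal are records whose new score stays in the same range as the old one; off-diagonal counts show movement between ranges.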

Assuming the two scores can be read as the score for period n and the score for period n+1, and under some hypotheses on the stationarity of the underlying process, a slight renormalisation of the frequency table gives you transfer (transition) probabilities. You can then use this matrix in simulations to get the distribution of scores for period n+2 and so on.

Without more data (explanatory variables or actual default values), you can't really get more useful metrics.
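To make the renormalisation and projection concrete, here is a small NumPy sketch. The counts are invented, the two scores are assumed to share the same three bins (so the table is square), and the stationarity hypothesis is taken for granted; it illustrates the idea rather than a validated model.

```python
import numpy as np

# Hypothetical counts: rows = new-score bin (period n+1), columns = old-score bin (period n).
counts = np.array([
    [30, 10,  0],
    [15, 40, 10],
    [ 5, 20, 50],
], dtype=float)

# Renormalise each column so it sums to 1: P[i, j] = P(new bin i | old bin j).
P = counts / counts.sum(axis=0, keepdims=True)

# Observed distributions over the bins at period n (column totals) and n+1 (row totals).
dist_n = counts.sum(axis=0) / counts.sum()
dist_n1 = counts.sum(axis=1) / counts.sum()

# Assuming the same matrix applies to every period, project forward one step at a time.
dist_n2 = P @ dist_n1
dist_n3 = P @ dist_n2

print(dist_n2, dist_n3)
```

By construction `P @ dist_n` reproduces `dist_n1`, so the first genuinely projected distribution is the one for period n+2.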
