Insights between two columns/variables in a DataFrame

I have data in two columns: one is a range of the old credit score (Input score range) and the other is the new credit score (cvsc100). How do I find insights from both of them, given that the old score is a range of values and the other column (cvsc100) is not?

I know how to calculate the Pearson correlation between two DataFrame columns in Python, but I believe that will not be sufficient. How should I proceed? Please advise.

Topic data-analysis descriptive-statistics python data-mining machine-learning

Category Data Science


The first step would be to bin cvsc100 into the same bins you have for your input; that will make the two columns directly comparable.
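
A minimal sketch of that binning step with pandas, assuming the input ranges are stored as "low-high" strings. The column name input_score_range, the bin edges and the sample values are hypothetical; replace them with the ranges that actually appear in your data.

```python
import pandas as pd

# Hypothetical sample: column names and the "low-high" range format are assumptions.
df = pd.DataFrame({
    "input_score_range": ["300-499", "500-649", "650-749", "750-850", "500-649"],
    "cvsc100": [480, 610, 655, 790, 700],
})

# Bin edges and labels assumed to mirror the input ranges.
edges = [300, 500, 650, 750, 851]
labels = ["300-499", "500-649", "650-749", "750-850"]

# right=False makes each bin [low, high), so a score of 500 lands in "500-649".
df["cvsc100_bin"] = pd.cut(df["cvsc100"], bins=edges, labels=labels, right=False)

print(df)
```

Scores outside the edges come back as NaN, which is a quick way to spot values that do not fit any of the input ranges.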

The second step would be to build a frequency table with your input range in columns and the binned cvsc100 in rows, with the count of values at each intersection. This helps you see how cvsc100 is distributed relative to your input and understand the underlying process.
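
One way to build such a frequency table is pandas.crosstab. The sketch below assumes both columns already hold the same bin labels; the column names and the sample rows are made up for illustration.

```python
import pandas as pd

labels = ["300-499", "500-649", "650-749", "750-850"]

# Hypothetical sample: one row per customer, both scores expressed as bin labels.
df = pd.DataFrame({
    "input_score_range": ["300-499", "300-499", "500-649", "500-649", "650-749", "750-850"],
    "cvsc100_bin":       ["300-499", "500-649", "500-649", "650-749", "650-749", "750-850"],
})

# Rows: binned cvsc100, columns: input range, cells: number of customers.
freq = pd.crosstab(df["cvsc100_bin"], df["input_score_range"])

# Reindex so every bin appears in the same order on both axes, even if empty.
freq = freq.reindex(index=labels, columns=labels, fill_value=0)

print(freq)
```

Counts on the diagonal are customers whose new score stays in the old range; counts off the diagonal are customers who moved to another bin.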

Assuming the two scores can be read as the score for period n (the input range) and for period n+1 (cvsc100), then under some hypotheses on the stationarity of the underlying process, and after a slight renormalisation (e.g. dividing each count by the total of its input bin), you get transition probabilities. You can use this matrix in simulations to get the distribution of scores for periods n+2, n+3, and so on.
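
A sketch of that idea with NumPy, under the stationarity assumption stated above. The count matrix is invented, and it is laid out with the period-n bin in rows and the period-n+1 bin in columns (the transpose of a table that has the input range in columns), so that each row can be normalised into probabilities.

```python
import numpy as np

# Hypothetical counts: rows = bin in period n (old score),
# columns = bin in period n+1 (new score).
counts = np.array([
    [50, 30, 15,  5],
    [10, 60, 25,  5],
    [ 5, 20, 60, 15],
    [ 0,  5, 25, 70],
], dtype=float)

# Row-normalise: P[i, j] estimates the probability of moving from bin i to bin j.
# This assumes every row has at least one observation (no division by zero).
P = counts / counts.sum(axis=1, keepdims=True)

# Observed distribution over bins in period n.
dist_n = counts.sum(axis=1) / counts.sum()

# Propagate one and two steps ahead, assuming P does not change over time.
dist_n1 = dist_n @ P
dist_n2 = dist_n1 @ P

print(np.round(P, 3))
print(np.round(dist_n2, 3))
```

The one-step result dist_n1 reproduces the observed period-n+1 distribution by construction; only dist_n2 and later steps are actual projections, and they are only as good as the stationarity hypothesis.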

Without more data (explanatory variables or actual default values), you can't really get more useful metrics.
