spearmans-rank-correlation

Correlation analysis yields conflicting results: positive Pearson and negative Spearman

No-Time-To-Day

March 29, 2022 11:38

I have four features, x1, x2, x3, and x4. Their correlations with y are similar to one another under Pearson, and similar to one another under Spearman. However, they are all roughly +0.15 under Pearson and roughly -0.6 under Spearman. Does this make sense? How can I interpret this result? All four features are definitely related to the target, and from a common-sense perspective the sign of the Spearman coefficient is the more plausible one.

Topic: spearmans-rank-correlation data-analysis pearsons-correlation-coefficient correlation

Category: Data Science
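A sign flip between Pearson and Spearman, as in the question above, is possible when a few extreme points dominate the Pearson coefficient while most of the data follow the opposite monotone trend. The sketch below uses made-up data (not the asker's) to show one such case: nine points with a decreasing trend plus a single far-away point.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical data: y falls as x rises for the first nine points,
# then one extreme point sits far out in the upper right.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], dtype=float)
y = np.array([9, 8, 7, 6, 5, 4, 3, 2, 1, 100], dtype=float)

pearson_r, _ = pearsonr(x, y)      # driven by the single extreme point
spearman_rho, _ = spearmanr(x, y)  # uses ranks, so the extreme point counts as one rank

print(f"Pearson r    = {pearson_r:+.3f}")
print(f"Spearman rho = {spearman_rho:+.3f}")
```

Here Pearson comes out positive and Spearman negative. Whether this is what happens in the asker's data is an assumption; a scatter plot of each feature against y would be the way to check.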

Rank correlation with Spearman and Kendall

Mir

March 18, 2022 18:26

When interpreting the correlation between ranks, should I use the rho value (for Spearman's method), the tau value (for Kendall's tau), or the W value (for Kendall's W), or should I take the p-value into consideration? And do NaN values in the ranks affect the interpretation of the correlation?

Topic: kendalls-tau-coefficient spearmans-rank-correlation correlation

Category: Data Science
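For the rho/tau/p-value question above: the coefficient describes the strength and direction of the monotone association, while the p-value describes the evidence against zero association, so the two are usually read together. The sketch below, on made-up ranks, also shows one way of handling NaNs, namely dropping any pair in which either value is missing before computing the coefficients. Kendall's W is not covered here.

```python
import numpy as np
from scipy.stats import kendalltau, spearmanr

# Hypothetical rankings with missing entries.
a = np.array([1, 2, 3, 4, 5, np.nan, 7])
b = np.array([2, 1, 4, 3, np.nan, 6, 7])

# Pairwise deletion: keep only positions where both values are present.
keep = ~(np.isnan(a) | np.isnan(b))
a_ok, b_ok = a[keep], b[keep]

rho, rho_p = spearmanr(a_ok, b_ok)
tau, tau_p = kendalltau(a_ok, b_ok)

print(f"pairs used: {keep.sum()} of {len(a)}")
print(f"Spearman rho = {rho:.3f} (p = {rho_p:.3f})")
print(f"Kendall tau  = {tau:.3f} (p = {tau_p:.3f})")
```

Dropping pairs reduces the sample size, which widens the uncertainty and typically raises the p-value, so the number of pairs actually used is worth reporting next to the coefficient.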

What statistical method should I use to find the correlation between number of days and AmountEarned?

LordoftheRingYVR

March 11, 2022 17:07

I am new to data science and I have a Python data frame with NumberofDays, CountofJobs, and AmountEarned. What statistical method should I use to find a correlation between NumberofDays and AmountEarned?

NumberofDays  CountofJobs  AmountEarned
20            3            50000
22            18           10000
35            10           80000

Topic: spearmans-rank-correlation pearsons-correlation-coefficient correlation

Category: Data Science
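As a minimal sketch for the question above, pandas can compute both Pearson and Spearman between two columns with `Series.corr`. The three rows are the ones shown in the question; with so few rows the numbers only illustrate the syntax and say nothing reliable about the relationship.

```python
import pandas as pd

# The three rows quoted in the question.
df = pd.DataFrame({
    "NumberofDays": [20, 22, 35],
    "CountofJobs": [3, 18, 10],
    "AmountEarned": [50000, 10000, 80000],
})

# Pearson measures linear association; Spearman measures monotone association on ranks.
pearson = df["NumberofDays"].corr(df["AmountEarned"], method="pearson")
spearman = df["NumberofDays"].corr(df["AmountEarned"], method="spearman")

print(f"Pearson  = {pearson:.3f}")
print(f"Spearman = {spearman:.3f}")
```

Which of the two is appropriate depends on whether the relationship is expected to be roughly linear (Pearson) or only monotone (Spearman), which a scatter plot of the full data would help decide.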

Correlation with target variable for regression problem

william007

February 13, 2022 06:38

Given the following dataframe

   age  job       salary
0  1    Doctor    100
1  2    Engineer  200
2  3    Lawyer    300
...

with age as numeric and job as categorical, I want to test the correlation with salary in order to select the features (age and/or job) for predicting salary (a regression problem). Can I use the following APIs from sklearn (or another API) to test it?

sklearn.feature_selection.f_regression
sklearn.feature_selection.mutual_info_regression

If yes, what is the right method and syntax to test the correlation? Following …

Topic: spearmans-rank-correlation pearsons-correlation-coefficient correlation scikit-learn feature-selection

Category: Data Science
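A rough sketch of how the two scikit-learn functions named in the question above could be called. The data are synthetic and hypothetical (the question shows only three rows, too few for the nearest-neighbour estimator behind `mutual_info_regression`), and encoding job as integer codes flagged as discrete is one possible choice, not necessarily the recommended one.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import f_regression, mutual_info_regression

# Hypothetical data: 90 rows, three jobs repeated evenly, salary depends on both.
rng = np.random.default_rng(0)
n = 90
df = pd.DataFrame({
    "age": rng.integers(25, 60, size=n),
    "job": np.tile(["Doctor", "Engineer", "Lawyer"], n // 3),
})
base = df["job"].map({"Doctor": 300, "Engineer": 200, "Lawyer": 250})
df["salary"] = base + 2.0 * df["age"] + rng.normal(0, 10, size=n)

# Encode job as integer codes; column 0 is continuous, column 1 is discrete.
X = np.column_stack([
    df["age"].to_numpy(dtype=float),
    df["job"].astype("category").cat.codes.to_numpy(dtype=float),
])
y = df["salary"].to_numpy()

# F-test for a linear relationship; applied to the numeric feature only.
f_stat, f_pval = f_regression(X[:, [0]], y)

# Mutual information handles the categorical column when flagged as discrete.
mi = mutual_info_regression(X, y, discrete_features=[False, True], random_state=0)

print(f"age: F = {f_stat[0]:.2f}, p = {f_pval[0]:.4f}")
print(f"mutual information (age, job) = {mi}")
```

`f_regression` assumes a linear effect, so passing arbitrary integer codes for job would impose an artificial ordering; one-hot encoding or a one-way ANOVA per category is a common alternative for the categorical column.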

How to assess whether neural network performance is associated with a nuisance variable

srcerer

December 6, 2021 21:09

Problem: I have a convolutional neural network model that takes a video as input and outputs a continuous variable. I want to assess whether the performance of the model is associated with another continuous variable (age; not included in the model). Solution attempt: If this were a linear regression model, I think I could do a Spearman rank correlation test: basically, plot the absolute value of the residuals (true value - predicted value) against the nuisance variable (age), then calculate the Spearman …

Topic: spearmans-rank-correlation convolutional-neural-network

Category: Data Science
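The approach described in the question above (Spearman correlation between absolute residuals and the nuisance variable) does not depend on the model being linear, since it only needs true values, predictions, and ages. A minimal sketch on simulated data, where the predictions are deliberately made noisier for older subjects (an assumption for illustration only):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n = 500

# Simulated stand-ins for the real quantities.
age = rng.uniform(20, 80, size=n)
y_true = rng.normal(50, 10, size=n)
# Assumed for illustration: prediction error spread grows with age.
y_pred = y_true + rng.normal(0, 1, size=n) * (age / 20) ** 2

# Absolute residual as a per-sample measure of model error.
abs_resid = np.abs(y_true - y_pred)

rho, pval = spearmanr(age, abs_resid)
print(f"Spearman rho = {rho:.3f}, p = {pval:.2e}")
```

A positive rho here means errors tend to be larger at higher ages. In practice `y_true` and `y_pred` would come from a held-out set, so that the residuals reflect generalization error.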

Analysing process data with sub groupings and checking for correlation

Aesir

August 12, 2021 08:17

I have a dataset of process data for different equipment with many sensors. I would like to check the correlation of the different sensors to see if there is any strong correlation between some sensors and potentially reduce the size of my dataset. Within this process data there are many different processes of varying lengths and different equipment. For now I am asserting that the different equipment shouldn't make a difference and therefore I do not want to include this …

Topic: spearmans-rank-correlation pearsons-correlation-coefficient correlation pyspark

Category: Data Science
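One way to sketch the sensor screening described above is to drop the grouping column, compute a Spearman correlation matrix, and list sensor pairs above a threshold. The question is tagged pyspark; this sketch uses pandas on simulated data for brevity, and the column names and the 0.9 threshold are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)

# Simulated process data: sensor_b nearly duplicates sensor_a, sensor_c is unrelated.
df = pd.DataFrame({
    "equipment": np.tile(["E1", "E2"], n // 2),
    "sensor_a": a,
    "sensor_b": 2 * a + 0.01 * rng.normal(size=n),
    "sensor_c": rng.normal(size=n),
})

# Ignore the equipment grouping, as the question proposes, and correlate sensors only.
corr = df.drop(columns="equipment").corr(method="spearman").abs()

# Look only above the diagonal so each pair is reported once.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
rows, cols = np.where(upper & (corr.to_numpy() > 0.9))
pairs = [(corr.index[i], corr.columns[j]) for i, j in zip(rows, cols)]

print(pairs)
```

For each reported pair, one of the two sensors is a candidate for removal. Repeating the computation per equipment or per process (for example with `groupby`) would test the assumption that the grouping makes no difference.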

Pearson vs Spearman vs Kendall

May 22, 2021 10:09

What are the characteristics of the three correlation coefficients, how do they compare, and what assumptions does each make? Can somebody kindly take me through the concepts?

Topic: kendalls-tau-coefficient spearmans-rank-correlation pearsons-correlation-coefficient correlation

Category: Data Science
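A small illustration of the main difference between the three coefficients asked about above: Pearson measures linear association, while Spearman and Kendall depend only on the ordering of the values. For a relationship that is perfectly monotone but not linear, such as y = exp(x), the two rank-based coefficients reach 1 while Pearson stays below it.

```python
import numpy as np
from scipy.stats import kendalltau, pearsonr, spearmanr

# A strictly increasing but strongly non-linear relationship.
x = np.linspace(0, 5, 50)
y = np.exp(x)

r, _ = pearsonr(x, y)      # linear association
rho, _ = spearmanr(x, y)   # Pearson correlation of the ranks
tau, _ = kendalltau(x, y)  # based on concordant vs. discordant pairs

print(f"Pearson r    = {r:.3f}")
print(f"Spearman rho = {rho:.3f}")
print(f"Kendall tau  = {tau:.3f}")
```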

Correlation/distance between sparse vectors

Roger Vadim

January 20, 2021 13:52

I am looking for a metric for comparing gene count tables. These are long columns of data (a few million genes by a few dozen samples), with all non-negative entries, about 90% of which are zeros. The goal is to compare the performance of several tools/algorithms that these tables originate from, by comparing the resulting tables among themselves or with the expected counts (in the case of simulated data). In principle, one compares on a sample-by-sample basis, but comparing different …

Topic: sparse spearmans-rank-correlation distance correlation

Category: Data Science
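One thing to watch for with data like that described above is that shared zeros form a large block of tied ranks, which can push a rank correlation close to 1 regardless of how well the non-zero counts agree. The sketch below uses simulated counts (about 90% zeros, all parameters hypothetical) and compares Spearman over all genes with Spearman restricted to genes that are non-zero in at least one of the two vectors.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_genes = 10_000

# Simulated "expected" counts: roughly 10% of genes expressed, the rest zero.
expressed = rng.random(n_genes) < 0.10
truth = np.where(expressed, rng.poisson(20, size=n_genes), 0)

# Simulated tool output: a noisy re-measurement; a zero rate always yields zero.
estimate = rng.poisson(truth)

# Correlation over every gene, shared zeros included.
rho_all, _ = spearmanr(truth, estimate)

# Correlation over genes with a non-zero count in at least one vector.
nonzero = (truth > 0) | (estimate > 0)
rho_nonzero, _ = spearmanr(truth[nonzero], estimate[nonzero])

print(f"genes compared: all = {n_genes}, non-zero in either = {nonzero.sum()}")
print(f"Spearman (all genes)       = {rho_all:.3f}")
print(f"Spearman (non-zero subset) = {rho_nonzero:.3f}")
```

Reporting both numbers, or a separate measure of agreement on which genes are zero, keeps the two effects apart. Whether this fits the asker's comparison is an assumption, since the excerpt is truncated.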

When should mutual information be used for feature selection over other feature selection methods like correlation, ANOVA, etc.?

Ankita Talwar

June 18, 2020 06:23

I have a data set with categorical and continuous/ordinal explanatory variables and a continuous target variable. I tried to filter features using one-way ANOVA for categorical variables and Spearman's correlation coefficient for continuous/ordinal variables, using the p-value as the filter. I then also used mutual information regression to select features. The results from the two techniques do not match. Can someone please explain the discrepancy and which method should be used when?

Topic: spearmans-rank-correlation anova mutual-information feature-selection machine-learning

Category: Data Science
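One reason the two techniques in the question above can disagree is that Spearman only detects monotone relationships, whereas mutual information can pick up any dependence. A minimal sketch with a made-up symmetric relationship, y = x², where Spearman is essentially zero but mutual information is clearly positive:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.feature_selection import mutual_info_regression

# Symmetric grid around zero; y depends on x completely, but not monotonically.
x = np.arange(-100, 101) / 10.0
y = x ** 2

rho, pval = spearmanr(x, y)

# mutual_info_regression expects a 2-D feature matrix.
mi = mutual_info_regression(x.reshape(-1, 1), y, random_state=0)

print(f"Spearman rho = {rho:.3f} (p = {pval:.3f})")
print(f"mutual information = {mi[0]:.3f}")
```

A feature like this would be rejected by a Spearman p-value filter and kept by a mutual information filter. Whether that explains the asker's mismatch is an assumption; note also that mutual information scores come without p-values, so the two filters use different kinds of thresholds.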

Slightly different results between scipy.stats.spearmanr and manual calculation

shin

February 15, 2020 15:51

I have the following dataset. When I calculate the Spearman correlation coefficient with scipy.stats.spearmanr, it returns 0.718182.

    import pandas as pd
    import numpy as np
    from scipy.stats import spearmanr

    df = pd.DataFrame(
        [[7, 3], [6, 5], [5, 4], [3, 2], [6, 4], [8, 9], [9, 7]],
        columns=['Set of A', 'Set of B'])
    correlation, pval = spearmanr(df)
    print(f'correlation={correlation:.6f}, p-value={pval:.6f}')

It returns this:

    correlation=0.718182, p-value=0.069096

However, when I tried to calculate it manually:

    df_rank = pd.DataFrame(
        [[5, 2], [3.5, 4], [2, 4], [1, 1], [3.5, 4], [6, 7], [7, 6]],
        columns=['Rank of A', 'Rank …

Topic: spearmans-rank-correlation

Category: Data Science
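One common source of small discrepancies of this kind is tied values. The shortcut formula 1 - 6Σd²/(n(n² - 1)) is exact only when there are no ties; with ties, the Spearman coefficient is the Pearson correlation of the average ranks. The sketch below uses the data from the question and ranks computed by `scipy.stats.rankdata` (not the manual ranks from the truncated excerpt) to compare the two.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

# Data from the question; both columns contain a tie (6, 6 and 4, 4).
a = np.array([7, 6, 5, 3, 6, 8, 9])
b = np.array([3, 5, 4, 2, 4, 9, 7])

# Average ranks: tied values share the mean of the ranks they occupy.
rank_a = rankdata(a)
rank_b = rankdata(b)

rho_scipy, _ = spearmanr(a, b)
rho_ranks, _ = pearsonr(rank_a, rank_b)  # same definition, computed by hand

# Shortcut formula, valid only in the absence of ties.
n = len(a)
d2 = np.sum((rank_a - rank_b) ** 2)
rho_shortcut = 1 - 6 * d2 / (n * (n ** 2 - 1))

print(f"spearmanr        = {rho_scipy:.6f}")
print(f"Pearson on ranks = {rho_ranks:.6f}")
print(f"shortcut formula = {rho_shortcut:.6f}")
```

The first two values agree (0.718182), while the shortcut formula gives 0.723214 on the same ranks.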
