How to test a likelihood hypothesis on a dataset?
How can I test the following hypothesis? The larger the fare, the more likely the customer is to be traveling alone.
Using the data below, how would one test this hypothesis?
import seaborn as sns
# load the Titanic dataset that ships with seaborn
df = sns.load_dataset('titanic')
df[['fare','alone']].head()
      fare  alone
0   7.2500  False
1  71.2833  False
2   7.9250   True
3  53.1000  False
4   8.0500   True
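One way to look at the hypothesis before running any formal test is to bin the fares and compare the share of solo travellers in each bin. The following is only a rough sketch: the helper name `share_alone_by_fare_bin` and the toy frame are made up for illustration and are not part of the original post; only the column names `fare` and `alone` come from the dataset above.

```python
import pandas as pd

def share_alone_by_fare_bin(df, n_bins=4):
    """Return the share of solo travellers within each fare quantile bin."""
    # qcut splits the fares into bins holding roughly equal numbers of rows;
    # duplicates="drop" merges bins whose edges happen to coincide
    bins = pd.qcut(df["fare"], q=n_bins, duplicates="drop")
    # the mean of a boolean column is the proportion of True values
    return df.groupby(bins, observed=True)["alone"].mean()

# toy frame with made-up values (NOT the Titanic data), only to show the call
toy = pd.DataFrame({
    "fare": [5.0, 7.0, 8.0, 9.0, 20.0, 30.0, 50.0, 80.0],
    "alone": [True, True, True, False, True, False, False, False],
})
shares = share_alone_by_fare_bin(toy, n_bins=2)
print(shares)
```

On the real data the call would presumably be `share_alone_by_fare_bin(df[['fare', 'alone']])`. If the hypothesis holds, the share should rise from the lowest fare bin to the highest; this is a descriptive check only, not a significance test.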
UPDATE
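# fares of passengers travelling alone
alone = df['fare'].loc[df['alone'] == True]
# fares of passengers not travelling alone (assumed definition; it was missing from the original post)
not_alone = df['fare'].loc[df['alone'] == False]
# import the Wilcoxon signed-rank test (intended for paired samples)
from scipy.stats import wilcoxon
# run the Wilcoxon test
wilcoxon(alone, not_alone)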
WilcoxonResult(statistic=10173.0, pvalue=2.8669052202786427e-28)
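A caveat on the attempt above: `scipy.stats.wilcoxon` is the signed-rank test for paired observations, whereas the fares of solo and non-solo passengers are two independent groups of different sizes. A possible alternative, sketched here under that assumption, is the Mann-Whitney U test with a one-sided alternative. The helper name `compare_fares` and the toy frame are hypothetical and only illustrate the call; no result on the real data is implied.

```python
import pandas as pd
from scipy.stats import mannwhitneyu

def compare_fares(df):
    """One-sided Mann-Whitney U test: are fares of solo travellers larger?"""
    # split the fares into two independent samples using the boolean column
    alone = df.loc[df["alone"], "fare"]
    not_alone = df.loc[~df["alone"], "fare"]
    # alternative="greater" tests whether the first sample (alone)
    # tends to take larger values than the second (not_alone)
    return mannwhitneyu(alone, not_alone, alternative="greater")

# toy frame with made-up values (NOT the Titanic data), only to show the call
toy = pd.DataFrame({
    "fare": [5.0, 7.0, 8.0, 9.0, 20.0, 30.0, 50.0, 80.0],
    "alone": [True, True, True, False, True, False, False, False],
})
result = compare_fares(toy)
print(result.pvalue)
```

With the real data this would be `compare_fares(df[['fare', 'alone']])`. A small p-value would support the claim that solo travellers pay higher fares; a large one would not, and swapping to `alternative="less"` tests the opposite direction.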