Pass a variable number of arguments to mstats.kruskalwallis

I am trying to run the Kruskal-Wallis test on multiple columns of my data. For that I wrote a function:

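import pandas as pd
from scipy.stats import mstats

# train is assumed to be a DataFrame that includes a 'SalePrice' column.
# var is a placeholder list; its slots are overwritten with one list of values per group.
var=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']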

def kruskawallis_test(column):
    k_test=train.loc[:,[column,'SalePrice']]
    x=pd.pivot_table(k_test,index=k_test.index, values='SalePrice',columns=column)

    for i in range(x.shape[1]):
        var[i]=x.iloc[:,i]
        var[i]=var[i][~var[i].isnull()].tolist()

    H, pval = mstats.kruskalwallis(var[0],var[1],var[2],var[3])

    return pval

The problem I am facing is that every column has a different number of groups, so var[0],var[1],var[2],var[3] will not be correct for every column. As far as I know, mstats.kruskalwallis() takes one input vector per group, each containing the values of that group from a particular column.

Is there a better way to do this?

Or how can I pass a different number of vectors for each column? For example:

If a column x has the levels a, b, c, d, e, how can I pass 5 vectors?

Topic anova non-parametric statistics machine-learning

Category Data Science


I solved it using:

def kruskawallis_test(column):
    k_test=train.loc[:,[column,'SalePrice']]
    x=pd.pivot_table(k_test,index=k_test.index, values='SalePrice',columns=column)

    for i in range(x.shape[1]):
        var[i]=x.iloc[:,i]
        var[i]=var[i][~var[i].isnull()].tolist()

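    # Collect one list of SalePrice values per group into a tuple.
    m=()
    for i in range(x.shape[1]):
        m=m+(var[i],)

    # (m) is the same object as m; the parentheses do not add a level of nesting.
    args=(m)

    # Unpack the tuple so that each group is passed as a separate argument.
    H, pval = mstats.kruskalwallis(*args)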

    return pval

I started with an empty tuple, m=(), and in a loop over range(x.shape[1]) appended each group to it with m=m+(var[i],). I then set args=(m) and passed it to mstats.kruskalwallis(*args).

The * in *args unpacks the tuple, so each group becomes a separate positional argument. This is what let me pass a variable number of groups.
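
The unpacking step can be seen in isolation with a small sketch. The function count_groups and the sample data are made up for illustration; they only stand in for a function that, like mstats.kruskalwallis, accepts any number of positional arguments.

```python
def count_groups(*groups):
    # Each positional argument arrives as one element of the tuple `groups`.
    return [len(g) for g in groups]

# Hypothetical data: five groups of different sizes.
samples = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10], [11, 12]]

# With the star, the five inner lists are passed as five separate arguments.
sizes = count_groups(*samples)

# Without the star, the whole outer list is passed as a single argument.
sizes_no_star = count_groups(samples)
```

The same works for a list as for a tuple, so the groups do not have to be collected in a tuple first.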

Please reply if you can suggest a better way.
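
One possible simplification, offered as a sketch rather than a definitive answer, is to build the groups with groupby instead of a pivot table and a global list. The function name kruskalwallis_by_column, the target parameter, and the toy DataFrame below are assumptions made for illustration; only mstats.kruskalwallis and the 'SalePrice' column come from the code above.

```python
import pandas as pd
from scipy.stats import mstats

def kruskalwallis_by_column(df, column, target="SalePrice"):
    # One list of target values per level of `column`;
    # missing target values are dropped within each group.
    groups = [g.dropna().tolist() for _, g in df.groupby(column)[target]]
    # Unpack the list so each group is a separate positional argument.
    H, pval = mstats.kruskalwallis(*groups)
    return pval

# Toy data standing in for `train` (hypothetical values).
train = pd.DataFrame({
    "Quality": ["a", "a", "a", "b", "b", "b", "c", "c", "c"],
    "SalePrice": [100, 110, 120, 200, 210, 220, 300, 310, 320],
})

pval = kruskalwallis_by_column(train, "Quality")
```

Because the list of groups is built from whatever levels the column has, the same function works for columns with any number of levels, and it does not depend on a pre-sized placeholder list.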
