How to fix my CSV files? (ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required)

I imported two CSV files into df1 and df2 and concatenated them into a single DataFrame, data. When I call mutual_info_regression on it, I get ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required. I have checked the dimensions of X, y, and discrete_features, and they all seem okay.

Since the same code works with other CSV files (I have tested this), I think the problem is with my CSV files and not the code.

import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

# Label the rows of each file, then stack them into one DataFrame
df1 = pd.read_csv("WT_MDE.csv", index_col=0)
df1["Interact"] = 1

df2 = pd.read_csv("M_MDE.csv", index_col=0)
df2["Interact"] = 0

data = pd.concat([df1, df2])

# Separate the target column from the features
X = data.copy()
y = X.pop("Interact")
discrete_features = X.dtypes == float

def make_mi_scores(X, y, discrete_features):
    mi_scores = mutual_info_regression(X, y, discrete_features=discrete_features)
    mi_scores = pd.Series(mi_scores, name="MI Scores", index=X.columns)
    mi_scores = mi_scores.sort_values(ascending=False)
    return mi_scores

mi_scores = make_mi_scores(X, y, discrete_features)

Google Drive Link to The CSV Files

I would really appreciate it if anyone could help.

Topic mutual-information scikit-learn python data-cleaning

Category Data Science


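The problem seems to be the discrete_features argument passed to mutual_info_regression, not the CSV files. The mask X.dtypes == float marks every float column as discrete, which is the opposite of what is usually intended. When a continuous column whose values are all distinct is treated as discrete, each value becomes its own category, and scikit-learn appears to ignore samples whose category occurs only once. That can leave zero samples, which would explain the error. If you remove the argument completely (or set it to 'auto'), it will work fine. Alternatively, flag only the integer-typed columns as discrete.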
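A minimal sketch of both options, using synthetic data because the original CSV files are not reproduced here. The column names and the data are made up for illustration, and the helper make_mi_scores is adapted from the question. Only mutual_info_regression and its discrete_features parameter come from scikit-learn itself.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import mutual_info_regression

# Synthetic stand-in for the concatenated DataFrame (hypothetical columns).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "continuous_a": rng.normal(size=n),      # float column, values all distinct
    "continuous_b": rng.normal(size=n),      # float column, values all distinct
    "count_c": rng.integers(0, 5, size=n),   # integer column, few distinct values
})
y = pd.Series(np.repeat([1, 0], n // 2), name="Interact")

# Option 1: flag only integer-typed columns as discrete; floats stay continuous.
discrete_mask = np.array([pd.api.types.is_integer_dtype(t) for t in X.dtypes])


def make_mi_scores(X, y, discrete_features="auto"):
    """Return mutual information scores as a Series, highest first."""
    scores = mutual_info_regression(
        X, y, discrete_features=discrete_features, random_state=0
    )
    scores = pd.Series(scores, name="MI Scores", index=X.columns)
    return scores.sort_values(ascending=False)


mi_scores_mask = make_mi_scores(X, y, discrete_mask)

# Option 2: leave the default, which treats dense input as continuous.
mi_scores_auto = make_mi_scores(X, y)

print(mi_scores_mask)
print(mi_scores_auto)
```

With real data, it is worth printing X.dtypes first to confirm which columns are actually integer-typed before building the mask.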
